Preface to the Fourth Edition


In the Preface to the first edition of this book, published thirty years ago, 
we wrote that our aim was to help the reader to acquire a ‘reasonable under- 
standing of gauge theories that are being tested by contemporary experiments 
in high-energy physics’; and we stressed that our approach was intended to 
be both practical and accessible. 

We have pursued the same aim and approach in later editions. Shortly 
after the appearance of the first edition, a series of major discoveries at the 
CERN $\bar{p}p$ collider confirmed the existence of the W and Z bosons, with properties
predicted by the Glashow-Salam-Weinberg electroweak gauge theory;
and also provided further support for quantum chromodynamics, or QCD. 
Our second edition followed in 1989, expanded so as to include discussion, 
on the experimental side, of the new results; and, on the theoretical side, a 
fuller treatment of QCD, and an elementary introduction to quantum field 
theory, with limited applications. Subsequently, experiments at LEP and 
other laboratories were precise enough to test the Standard Model beyond 
the first order in perturbation theory (‘tree level’), being sensitive to higher 
order effects (‘loops’). In response, we decided it was appropriate to include 
the basics of ‘one-loop physics’. Together with the existing material on rel- 
ativistic quantum mechanics, and QED, this comprised volume 1 (2003) of 
our two-volume third edition. In a natural division, the non-Abelian gauge 
theories of the Standard Model, QCD and the electroweak theory, formed the 
core of volume 2 (2004). The progress of research on QCD, both theoretical 
and experimental, required new chapters on lattice quantum field theory, and 
on the renormalization group. The discussion of the central topic of sponta- 
neous symmetry breaking was extended, in particular so as to include chiral 
symmetry breaking. 

This new fourth edition retains the two-volume format, which has been 
generally well received, with broadly the same allocation of content as in 
the third edition. The principal new additions are, once again, dictated by 
substantial new experimental results — namely, in the areas of CP violation and 
neutrino oscillations, where great progress was made in the first decade of this 
century. Volume 2 now includes a new chapter devoted to CP violation and 
oscillations in mesonic and neutrino systems. Partly by way of preparation for 
this, volume 1 also contains a new chapter, on Lorentz transformations and 
discrete symmetries. We give a simple do-it-yourself treatment of Lorentz 
transformations of Dirac spinors, which the reader can connect to the group 
theory approach in appendix M of volume 2; the transformation properties of
bilinear covariants are easily managed. We also introduce Majorana fermions
at an early stage. This material is suitable for first courses on relativistic 
quantum mechanics, and perhaps should have been included in earlier editions 
(we thank a referee for urging its inclusion now). 

To make room for the new chapter in volume 1, the two introductory 
chapters of the third edition have been condensed into a single one, in the 
knowledge that excellent introductions to the basic facts of particle physics are 
available elsewhere. Otherwise, apart from correcting the known minor errors 
and misprints, the only other changes in volume 1 are some minor improve- 
ments in presentation, and appropriate updates on experimental numbers. 
Volume 2 contains significantly more in the way of updates and additions, as 
will be detailed in the Preface to that volume. But we have continued to omit 
discussion of speculations going beyond the Standard Model; after all, the cru- 
cial symmetry-breaking (Higgs) sector has only now become experimentally 
accessible. 
The Particles and Forces of the Standard Model


1.1 Introduction: the Standard Model 


The traditional goal of particle physics has been to identify what appear to be 
structureless units of matter and to understand the nature of the forces act- 
ing between them; all other entities are then to be successively constructed as 
composites of these elementary building blocks. The enterprise has a two-fold 
aspect: matter on the one hand, forces on the other. The expectation is that 
the smallest units of matter should interact in the simplest way; or that there 
is a deep connection between the basic units of matter and the basic forces. 
The joint matter/force nature of the enquiry is perfectly illustrated by Thom- 
son’s discovery of the electron and Maxwell’s theory of the electromagnetic 
field, which together mark the birth of modern particle physics. The electron 
was recognized both as the ‘particle of electricity’ — or as we might now say, 
as an elementary source of the electromagnetic field, with its motion consti- 
tuting an electromagnetic current — and also as an important constituent of 
matter. In retrospect, the story of particle physics over the subsequent one 
hundred years or so has consisted in the discovery and study of two new (non- 
electromagnetic) forces — the weak and the strong forces — and in the search 
for ‘electron-figures’ to serve both as constituents of the new layers of matter 
which were uncovered (first nuclei, and then hadrons) and also as sources of 
the new force fields. In the last quarter of the twentieth century, this effort 
culminated in decisive progress: the identification of a collection of matter 
units which are indeed analogous to the electron; and the highly convincing 
experimental verification of theories of the associated strong and weak force 
fields, which incorporate and generalize in a beautiful way the original elec- 
tron/electromagnetic field relationship. These theories are collectively called 
‘the Standard Model’ (or SM for short), to which this book is intended as an 
elementary introduction. 

In brief, the picture is as follows. The matter units are fermions, with 
spin-1/2 (in units of $\hbar$). They are of two types, leptons and quarks. Both are
structureless at the smallest distances currently probed by the highest-energy 
accelerators. The leptons are generalizations of the electron, the term denoting 
particles which, if charged, interact both electromagnetically and weakly; and 
if neutral, only weakly. By contrast, the quarks (which are the constituents
of hadrons, and thence of nuclei) interact via all three interactions, strong,
electromagnetic and weak. The weak and electromagnetic interactions of both 
quarks and leptons are described in a (partially) unified way by the electroweak 
theory of Glashow, Salam and Weinberg (GSW), which is a generalization 
of quantum electrodynamics or QED; the strong interactions of quarks are 
described by quantum chromodynamics or QCD, which is also analogous to 
QED. The similarity with QED lies in the fact that all three interactions are 
types of gauge theories, though realized in different ways. In the first volume 
of this book, we will get as far as QED; QCD and the electroweak theory are 
treated in volume 2. 

The reader will have noticed that the most venerable force of all — gravity 
— is absent from our story. In practical terms this is quite reasonable, since its 
effect is very many orders of magnitude smaller than even the weak force, at 
least until the interparticle separation reaches distances far smaller than those 
we shall be discussing. Conceptually also, gravity still seems to be somewhat 
distinct from the other forces which, as we have already indicated, are encour- 
agingly similar. There are no particular fermionic sources carrying ‘gravity 
charges’: it seems that all matter gravitates. This of course was a motivation 
for Einstein’s geometrical approach to gravity. Despite the lingering promise 
of string theory (Green et al. 1987, Polchinski 1998, Zwiebach 2004), it is 
fair to say that the vision of the unification of all the forces, which possessed 
Einstein, is still some way from realization. Gravitational interactions are not 
part of the SM. 

This book is not intended as a completely self-contained textbook on par- 
ticle physics, which would survey the broad range of observed phenomena and 
outline the main steps by which the picture described here has come to be 
accepted. For this we must refer the reader to other sources (e.g. Perkins 
2000, Bettini 2008). We proceed with a brief review of the matter (fermionic) 
content of the SM. 




1.2 The fermions of the Standard Model 
1.2.1 Leptons 


Forty years after Thomson's discovery of the electron, the first member of 
another generation of leptons (as it turned out) — the muon — was found inde- 
pendently by Street and Stevenson (1937), and by Anderson and Neddermeyer 
(1937). Following the convention for the electron, the $\mu^-$ is the particle and
the $\mu^+$ the antiparticle. At first, the muon was identified with the particle
postulated by Yukawa only two years earlier (1935) as the field quantum of 
the ‘strong nuclear force field’, the exchange of which between two nucleons 
would account for their interaction (see section 1.3.2). In particular, its mass
(105.7 MeV) was nicely within the range predicted by Yukawa. However, ex- 
periments by Conversi et al. (1947) established that the muon could not be 
Yukawa’s quantum since it did not interact strongly; it was therefore a lepton. 
The $\mu^-$ seems to behave in exactly the same way as the electron, interacting
only electromagnetically and weakly, with interaction strengths identical to 
those of an electron. 

In 1975 Perl et al. (1975) discovered yet another ‘replicant’ electron, the
$\tau^-$, with a mass of 1.78 GeV. Once again, the weak and electromagnetic
interactions of the $\tau^-$ ($\tau^+$) are identical to those of the $e^-$ ($e^+$).

At this stage one might well wonder whether we are faced with a ‘lepton 
spectroscopy’, of which the $e^-$, $\mu^-$ and $\tau^-$ are but the first three states. Yet
this seems not to be the correct interpretation. First, no other such states have
(so far) been seen. Second, all these leptons have the same spin (1/2), which
is certainly quite unlike any conventional excitation spectrum. And third,
no $\gamma$-transitions are observed to occur between the states, though this would
normally be expected. For example, the branching fraction for the process 


    $\mu^- \to e^- + \gamma$   (not observed)   (1.1)


is currently quoted as less than $1.2 \times 10^{-11}$ at the 90% confidence level
(Nakamura et al. 2010). Similarly there are (much less stringent) limits on
$\tau^- \to \mu^- + \gamma$ and $\tau^- \to e^- + \gamma$.

If the $e^-$ and $\mu^-$ states in (1.1) were, in fact, the ground and first excited
states of some composite system, the decay process (1.1) would be expected 
to occur as an electromagnetic transition, with a relatively high probability 
because of the large energy release. Yet the experimental upper limit on the 
rate is very tiny. In the absence of any mechanism to explain this, one sys- 
tematizes the situation, empirically, by postulating the existence of a selection 
rule forbidding the decay (1.1). In taking this step, it is important to real- 
ize that ‘absolute forbidden-ness’ can never be established experimentally: all 
that can be done is to place a (very small) upper limit on the branching frac- 
tion to the ‘forbidden’ channel, as here. The possibility will always remain 
open that future, more sensitive, experiments will reveal that some processes, 
assumed to be forbidden, are in fact simply extremely rare. 

Of course, such a proposed selection rule would have no physical content if 
it only applied to the one process (1.1); but it turns out to be generally true, 
applying not only to the electromagnetic interaction of the charged leptons, 
but to their weak interactions also. The upshot is that we can consistently 
account for observations (and non-observations) involving e's, $\mu$'s and $\tau$'s by
assigning to each a new additive quantum number (called ‘lepton flavour’)
which is assumed to be conserved. Thus we have electron flavour $L_e$ such that
$L_e(e^-) = 1$ and $L_e(e^+) = -1$; muon flavour $L_\mu$ such that $L_\mu(\mu^-) = 1$ and
$L_\mu(\mu^+) = -1$; and tau flavour $L_\tau$ such that $L_\tau(\tau^-) = 1$ and $L_\tau(\tau^+) = -1$.
Each is postulated to be conserved in all leptonic processes. So (1.1) is then
forbidden, the left-hand side having $L_e = 0$ and $L_\mu = 1$, while the right-hand
side has $L_e = 1$ and $L_\mu = 0$.

The electromagnetic interactions of the mu and the tau leptons are the 
same as for the electron. In weak interactions, each charged lepton (e, $\mu$, $\tau$) is
accompanied by its ‘own’ neutral partner, a neutrino. The one emitted with
the $e^-$ in $\beta$-decay was originally introduced by Pauli in 1930, as a ‘desperate
remedy’ to save the conservation laws of four-momentum and angular momentum.
In the Standard Model, the three neutrinos are assigned lepton flavour
quantum numbers in such a way as to conserve each lepton flavour separately.
Thus we assign $L_e = -1$, $L_\mu = 0$, $L_\tau = 0$ to the neutrino emitted in neutron
$\beta$-decay

    $n \to p + e^- + \bar{\nu}_e$   (1.2)

since $L_e = 0$ in the initial state and $L_e(e^-) = +1$; so the neutrino in (1.2) is an
antineutrino ‘of electron type’ (or ‘of electron flavour’). The physical reality 
of the antineutrinos emitted in nuclear $\beta$-decay was established by Reines and
collaborators in 1956 (Cowan et al. 1956), by observing that the antineutrinos
from a nuclear reactor produced positrons via the inverse $\beta$-process

    $\bar{\nu}_e + p \to n + e^+$.   (1.3)

The neutrino partnering the $\mu^-$ appears in the decay of the $\pi^-$:

    $\pi^- \to \mu^- + \bar{\nu}_\mu$,   (1.4)


where the $\bar{\nu}_\mu$ is an antineutrino of muon type ($L_\mu(\bar{\nu}_\mu) = -1$,
$L_e(\bar{\nu}_\mu) = 0 = L_\tau(\bar{\nu}_\mu)$). How do we know that $\bar{\nu}_\mu$ and $\bar{\nu}_e$ are not the same? An important
experiment by Danby et al. (1962) provided evidence that they are not. They
found that the neutrinos accompanying muons from $\pi$-decay always produced
muons on interacting with matter, never electrons. Thus, for example, the 
lepton flavour conserving reaction 


    $\bar{\nu}_\mu + p \to \mu^+ + n$   (1.5)

was observed, but the lepton flavour violating reaction

    $\bar{\nu}_\mu + p \to e^+ + n$   (not observed)   (1.6)

was not. As with (1.1), ‘non-observation’ of course means, in practice, an
upper limit on the cross section. Both types of neutrino occur in the $\beta$-decay
of the muon itself:

    $\mu^- \to \nu_\mu + e^- + \bar{\nu}_e$   (1.7)

in which $L_\mu = 1$ is initially carried by the $\mu^-$ and finally by the $\nu_\mu$, and the
$L_e$'s of the $e^-$ and $\bar{\nu}_e$ cancel each other out.
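
The flavour bookkeeping used for processes (1.1) to (1.7) can be sketched in a few lines of Python. This is only an illustrative check under the assignments given above; the particle labels (such as `nubar_mu`) and the function names are ad hoc choices for this sketch, and hadrons and the photon are simply treated as carrying zero lepton flavour.

```python
# (L_e, L_mu, L_tau) for each lepton, following the assignments in the text.
# The string labels are hypothetical names chosen for this sketch.
FLAVOUR = {
    "e-": (1, 0, 0), "e+": (-1, 0, 0),
    "mu-": (0, 1, 0), "mu+": (0, -1, 0),
    "tau-": (0, 0, 1), "tau+": (0, 0, -1),
    "nu_e": (1, 0, 0), "nubar_e": (-1, 0, 0),
    "nu_mu": (0, 1, 0), "nubar_mu": (0, -1, 0),
    "nu_tau": (0, 0, 1), "nubar_tau": (0, 0, -1),
}


def lepton_flavour(particles):
    """Sum (L_e, L_mu, L_tau) over a state; non-leptons contribute zero."""
    totals = [0, 0, 0]
    for name in particles:
        for i, value in enumerate(FLAVOUR.get(name, (0, 0, 0))):
            totals[i] += value
    return tuple(totals)


def conserves_lepton_flavour(initial, final):
    """True if every lepton flavour is the same before and after."""
    return lepton_flavour(initial) == lepton_flavour(final)


# Processes quoted in the text, as (initial state, final state).
PROCESSES = {
    "(1.1)": (["mu-"], ["e-", "gamma"]),
    "(1.2)": (["n"], ["p", "e-", "nubar_e"]),
    "(1.3)": (["nubar_e", "p"], ["n", "e+"]),
    "(1.4)": (["pi-"], ["mu-", "nubar_mu"]),
    "(1.5)": (["nubar_mu", "p"], ["mu+", "n"]),
    "(1.6)": (["nubar_mu", "p"], ["e+", "n"]),
    "(1.7)": (["mu-"], ["nu_mu", "e-", "nubar_e"]),
}

for label, (initial, final) in PROCESSES.items():
    verdict = "allowed" if conserves_lepton_flavour(initial, final) else "forbidden"
    print(label, verdict)
```

Only (1.1) and (1.6), the two processes marked ‘not observed’, fail the check; the rule says nothing about rates, only about which channels are excluded.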

In the same way, the $\nu_\tau$ is associated with the $\tau^-$, and we have arrived at
three generations of charged and neutral lepton doublets:


    ($\nu_e$, $e^-$), ($\nu_\mu$, $\mu^-$) and ($\nu_\tau$, $\tau^-$)   (1.8)


together with their antiparticles. 
TABLE 1.1
Properties of SM leptons.

Generation  Particle     Mass (MeV)             Q/e   $L_e$  $L_\mu$  $L_\tau$
1           $\nu_e$      $< 2 \times 10^{-6}$   0     1      0        0
            $e^-$        0.511                  -1    1      0        0
2           $\nu_\mu$    < 0.19                 0     0      1        0
            $\mu^-$      105.658                -1    0      1        0
3           $\nu_\tau$   < 18.2                 0     0      0        1
            $\tau^-$     1777                   -1    0      0        1


We should at this point note that another type of weak interaction is 
known, in which, for example, the $\bar{\nu}_\mu$ in (1.5) scatters elastically from the
proton, instead of changing into a $\mu^+$:


    $\bar{\nu}_\mu + p \to \bar{\nu}_\mu + p$.   (1.9)


This is an example of what is called a ‘neutral current’ process, (1.5) being a 
‘charged current’ one. In terms of the Yukawa-like exchange mechanism for 
particle interactions, to be described in the next section, (1.5) proceeds via 
the exchange of charged quanta ($W^\pm$), while in (1.9) a neutral quantum ($Z^0$)
is exchanged. 

As well as their flavour, one other property of neutrinos is of great interest, 
namely their mass. As originally postulated by Pauli, the neutrino emitted in 
$\beta$-decay had to have very small mass, because the maximum energy carried
off by the $e^-$ in (1.2) was closely equal to the difference in rest energies of
the neutron and proton. It was subsequently widely assumed (perhaps largely 
for simplicity) that all neutrinos were strictly massless, and it is fair to say 
that the original Standard Model made this assumption. Yet there is, in fact, 
no convincing reason for this (as there is for the masslessness of the photon 
— see chapter 6), and there is now clear evidence that neutrinos do indeed 
have very small, but non-zero, masses. It turns out that the question of 
neutrino masslessness is directly connected to another one: whether neutrino 
flavour is, in fact, conserved. If neutrinos are massless, as in the original 
Standard Model, neutrinos of different flavour cannot ‘mix’, in the sense of 
quantum-mechanical states; but mixing can occur if neutrinos have mass. The 
phenomenon of neutrino flavour mixing (or ‘neutrino oscillations’) is now well 
established, and is a subject of intense research. In this book we shall simply 
regard non-zero neutrino masses as part of the (updated) Standard Model. 

The SM leptons are listed in table 1.1, along with some relevant properties. 
Note that the limits on the neutrino masses, which are taken from Nakamura 
et al. 2010, do not include the results obtained from analyses of neutrino
oscillations. These oscillations, to which we shall return in chapter 21 in 
volume 2, are sensitive to the differences of squared masses of the neutrinos, 
not to the absolute scale of mass. 
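
Why oscillations cannot fix the absolute mass scale can be illustrated with the standard two-flavour vacuum oscillation formula, $P = \sin^2(2\theta)\,\sin^2(1.267\,\Delta m^2 L/E)$, with $\Delta m^2$ in eV$^2$, L in km and E in GeV. This formula is not derived in this chapter and is quoted here as an assumption; the numerical inputs below are hypothetical values chosen only for illustration.

```python
import math


def oscillation_probability(m1_sq_ev2, m2_sq_ev2, theta, length_km, energy_gev):
    """Two-flavour vacuum oscillation probability.

    Only the difference of the squared masses enters the phase.
    """
    delta_m_sq = m2_sq_ev2 - m1_sq_ev2
    phase = 1.267 * delta_m_sq * length_km / energy_gev
    return math.sin(2.0 * theta) ** 2 * math.sin(phase) ** 2


# Hypothetical inputs, not taken from the text.
THETA = 0.6          # mixing angle in radians
LENGTH_KM = 500.0    # source-to-detector distance
ENERGY_GEV = 1.0     # neutrino energy
DELTA_M_SQ = 2.5e-3  # squared-mass difference in eV^2

# Same squared-mass difference, very different absolute mass scales.
light = oscillation_probability(0.0, DELTA_M_SQ, THETA, LENGTH_KM, ENERGY_GEV)
heavy = oscillation_probability(1.0, 1.0 + DELTA_M_SQ, THETA, LENGTH_KM, ENERGY_GEV)
print(light, heavy)
```

The two probabilities agree to within rounding error, so a measurement of this kind constrains $\Delta m^2$ but not the individual masses.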

We now turn to the other fermions in the SM. 


1.2.2 Quarks 


Quarks are the constituents of hadrons, in which they are bound by the strong 
QCD forces. Hadrons with spins 1/2, 3/2, 5/2, ... (i.e. fermions) are baryons, those
with spins 0, 1, 2, ... (i.e. bosons) are mesons. Examples of baryons are
nucleons (the neutron n and the proton p), and hyperons such as $\Lambda^0$ and the
$\Sigma$ and $\Xi$ states. Evidence for the composite nature of hadrons accumulated
during the 1960s and 1970s. Elastic scattering of electrons from protons by 
Hofstadter and co-workers (Hofstadter 1963) showed that the proton was not 
pointlike, but had an approximately exponential distribution of charge with a 
root mean square radius of about 0.8 fm. Much careful experimentation in the 
field of baryon and meson spectroscopy revealed sequences of excited states, 
strongly reminiscent of those well-known in atomic and nuclear physics. 
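
As a small worked example of what an ‘exponential distribution with a root mean square radius of about 0.8 fm’ implies, suppose the charge density has the form $\rho(r) \propto e^{-r/a}$. Then $\langle r^2 \rangle = 12a^2$, so $a = 0.8/\sqrt{12} \approx 0.23$ fm. The sketch below checks this numerically; the functional form and the parameter name `a_fm` are assumptions made for illustration, since the text does not specify the distribution beyond ‘approximately exponential’.

```python
import math

RMS_RADIUS_FM = 0.8  # value quoted in the text


def rms_radius_exponential(a_fm, r_max_factor=60.0, steps=20_000):
    """Numerically evaluate sqrt(<r^2>) for rho(r) proportional to exp(-r/a)."""
    r_max = r_max_factor * a_fm
    dr = r_max / steps
    norm = 0.0
    second_moment = 0.0
    for i in range(steps):
        r = (i + 0.5) * dr                    # midpoint rule
        weight = r * r * math.exp(-r / a_fm)  # 4*pi cancels in the ratio
        norm += weight
        second_moment += r * r * weight
    return math.sqrt(second_moment / norm)


# Analytic result: <r^2> = 12 a^2, so a = r_rms / sqrt(12).
a_fm = RMS_RADIUS_FM / math.sqrt(12.0)
print(a_fm, rms_radius_exponential(a_fm))
```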

The conclusion would now seem irresistible that such spectra should be 
interpreted as the energy levels of systems of bound constituents. A spe- 
cific proposal along these lines was made in 1964 by Gell-Mann (1964) and 
Zweig (1964). Though based on somewhat different (and much more frag- 
mentary) evidence, their suggestion has turned out to be essentially correct. 
They proposed that baryons contain three spin-1/2 constituents called quarks
(by Gell-Mann), while mesons are quark-antiquark systems. One immediate
consequence is that quarks have fractional electromagnetic charge. For example,
the proton has two quarks of charge +2/3, called ‘up’ (u) quarks, and one
quark of charge −1/3, the ‘down’ (d) quark. The neutron has the combination
ddu, while the $\pi^+$ has one u and one anti-d ($\bar{d}$), and so on.
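
The charge assignments can be checked by simple addition. The following sketch uses exact fractions; the convention of writing an antiquark as the quark label followed by `~` is an ad hoc notation introduced here, not one used in the text.

```python
from fractions import Fraction

# Quark charges in units of e, as given in the text.
QUARK_CHARGE = {"u": Fraction(2, 3), "d": Fraction(-1, 3)}


def hadron_charge(constituents):
    """Total charge in units of e; 'x~' denotes the antiquark of x."""
    total = Fraction(0)
    for quark in constituents:
        if quark.endswith("~"):
            total -= QUARK_CHARGE[quark[:-1]]  # antiquark: opposite charge
        else:
            total += QUARK_CHARGE[quark]
    return total


print("proton :", hadron_charge(["u", "u", "d"]))
print("neutron:", hadron_charge(["d", "d", "u"]))
print("pi+    :", hadron_charge(["u", "d~"]))
```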

Quite simple quantum-mechanical bound state quark models, based on 
these ideas, were remarkably successful in accounting for the observed hadronic 
spectra. Nevertheless, many physicists, in the 1960s and early 1970s, con- 
tinued to regard quarks more as useful devices for systematizing a mass of 
complicated data than as genuine items of physical reality. One reason for 
this scepticism must now be confronted, for it constitutes a major new twist 
in the story of the structure of matter. 

Gell-Mann ended his 1964 paper with the remark: ‘A search for stable 
quarks of charge −1/3 or +2/3 and/or stable di-quarks of charge −2/3 or +1/3 or
+4/3 at the highest energy accelerators would help to reassure us of the
non-existence of real quarks’. Indeed, with one possible exception (La Rue et al.
1977, 1981), this ‘reassurance’ has been handsomely provided! Unlike the 
constituents of atoms and nuclei, quarks have not been observed as stable 
isolated particles. When hadrons of the highest energies currently available 
are smashed into each other, what is observed downstream is only lots more 
hadrons, not fractionally charged quarks. The explanation for this novel
behaviour of quarks is now believed to lie in the nature of the interquark force
(QCD). We shall briefly discuss this force in section 1.3.6, and treat it in detail
in volume 2. The consensus at present is that QCD does imply the ‘confinement’
of quarks: that is, they do not exist as isolated single particles¹, only
as groups confined to hadronic volumes.

When Gell-Mann and Zweig made their proposal, three types of quark 
were enough to account for the observed hadrons: in addition to the u and 
d quarks, the ‘strange’ quark s was needed to describe the known strange 
particles such as the hyperon $\Lambda^0$ (uds), and the strange mesons like $K^0$ ($d\bar{s}$).
In 1964, Bjorken and Glashow (1964) discussed the possible existence of a 
fourth quark on the basis of quark—lepton symmetry, but a strong theoretical 
argument for the existence of the c (‘charm’) quark, within the framework of 
gauge theories of electroweak interactions, was given by Glashow, Iliopoulos 
and Maiani (1970), as we shall discuss in volume 2. They estimated that 
the c quark mass should lie in the range 3-4 GeV. Subsequently, Gaillard 
and Lee (1974) performed a full (one-loop) calculation in the then newly-developed
renormalizable electroweak theory, and predicted $m_c \approx 1.5$ GeV.
The prediction was spectacularly confirmed in November of the same year with
the discovery (Aubert et al. 1974, Augustin et al. 1974) of the $J/\psi$ system,
which was soon identified as a $c\bar{c}$ composite (and dubbed ‘charmonium’), with
a mass in the vicinity of 3 GeV. Subsequently, mesons such as $D^0$($c\bar{u}$) and
$D^+$($c\bar{d}$) carrying the c quark were identified (Goldhaber et al. 1976, Peruzzi
et al. 1976), consolidating this identification. 

The second generation of quarks was completed in 1974, with the two 
quark doublets (u, d) and (c, s) in parallel with the lepton doublets ($\nu_e$, $e^-$)
and ($\nu_\mu$, $\mu^-$). But even before the discovery of the c quark, the possibility that
a completely new third-generation quark doublet might exist was raised in a 
remarkable paper by Kobayashi and Maskawa (1973). Their analysis focused 
on the problem of incorporating the known violation of CP symmetry (the 
product² of particle-antiparticle conjugation C and parity P) into the quark
sector of the renormalizable electroweak theory. CP-violation in the decays 
of neutral K-mesons had been discovered by Christenson et al. (1964), and 
Kobayashi and Maskawa pointed out that it was very difficult to construct a 
plausible model of CP-violation in weak transitions of quarks with only two 
generations. They suggested, however, that CP-violation could be naturally 
accommodated by extending the theory to three generations of quarks. Their 
description of CP-violation thus entailed the very bold prediction of two en- 
tirely new and undiscovered quarks, the (t, b) doublet, where t (‘top’) has 
charge 2/3 and b (‘bottom’) has charge −1/3.

In 1975, with the discovery of the $\tau^-$ mentioned earlier, there was already
evidence for a third generation of leptons. The discovery of the b quark 


¹ With the (fleeting) exception of the t quark, as we shall see in a moment.
² We shall discuss these symmetries in chapter 4.
in 1977 resulted from the observation of massive mesonic states generally
known as $\Upsilon$ (‘upsilon’) (Herb et al. 1977, Innes et al. 1977), which were
identified as $b\bar{b}$ composites. Subsequently, b-carrying mesons were found.
Finally, firm evidence for the expected t quark was obtained by the CDF and
D0 collaborations at Fermilab in 1995 (Abe et al. 1995, Abachi et al. 1995);
see Bettini 2008, section 4.10, for details about the discovery of the top quark. 
The full complement of three generations of quark doublets is then 


    (u, d), (c, s) and (t, b)   (1.10)


together with their antiparticles, in parallel with the three generations of 
lepton doublets (1.8). 

One particular feature of the t quark requires comment. Its mass is so 
large that, although it decays weakly, the energy release is so great that its 
lifetime is some two orders of magnitude shorter than typical strong interaction 
timescales; this means that it decays before any t-carrying hadrons can be 
formed. So when a t quark is produced (in a $p\bar{p}$ collision, for example),
it decays as a free (unbound) particle. Its mass can be determined from a
kinematic analysis of the decay products.

We must now discuss the quantum numbers carried by quarks. First of 
all, each quark listed in (1.10) comes in three varieties, distinguished by a 
quantum number called ‘colour’. It is precisely this quantum number that 
underlies the dynamics of QCD (see section 1.3.6). Colour, in fact, is a kind 
of generalized charge, for the strong QCD interactions. We shall denote the 
three colours of a quark by ‘red’, ‘blue’, and ‘green’. Thus we have the triplet 
($u_r$, $u_b$, $u_g$), and similarly for all the other quarks.

Secondly, quarks carry flavour quantum numbers, like the leptons. In the 
quark case, they are as follows. The two quarks which are familiar in ordinary 
matter, ‘u’ and ‘d’, are an isospin doublet (see chapter 12 in volume 2) with 
$T_3 = +1/2$ for ‘u’ and $T_3 = -1/2$ for ‘d’. The flavour of ‘s’ is strangeness,
with the value S = −1. The flavour of ‘c’ is charm, with value C = +1, that
of ‘b’ has value $\tilde{B} = -1$ (we use $\tilde{B}$ to distinguish it from baryon number B),
and the flavour of ‘t’ is T = +1. The convention is that the sign of the flavour 
number is the same as that of the charge. 

The strong and electromagnetic interactions of quarks are independent
of quark flavour, and depend only on the strong charge and the
electromagnetic charge, respectively. This means, in particular, that flavour cannot
change in a strong interaction among hadrons — that is, flavour is conserved 
in such interactions. For example, from a zero strangeness initial state, the 
strong interaction can only produce pairs of strange particles, with cancelling 
strangeness. This is the phenomenon of ‘associated production’, known since 
the early days of strange particle physics in the 1950s. Similar rules hold for 
the other flavours: for example, the t quark, once produced, cannot decay to 
a lighter quark via a strong interaction, since this would violate T-conservation.
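
Flavour conservation in strong interactions amounts to saying that the net number of quarks of each flavour (quarks minus antiquarks) is unchanged. The sketch below applies this to one associated-production reaction, $\pi^- + p \to K^0 + \Lambda^0$, and to a hypothetical strangeness-violating alternative, $\pi^- + p \to K^0 + n$. Both reactions are illustrative choices that do not appear in the text, the quark content of the $\pi^-$ ($d\bar{u}$) is supplied here as standard knowledge, and the `~` suffix for antiquarks is an ad hoc notation.

```python
from collections import Counter

# Quark content of the hadrons used below ('x~' denotes an antiquark).
HADRONS = {
    "pi-": ["d", "u~"],
    "p": ["u", "u", "d"],
    "n": ["u", "d", "d"],
    "K0": ["d", "s~"],
    "Lambda0": ["u", "d", "s"],
}


def net_flavour(state):
    """Net quark number per flavour (quarks minus antiquarks) for a state."""
    counts = Counter()
    for hadron in state:
        for quark in HADRONS[hadron]:
            if quark.endswith("~"):
                counts[quark[:-1]] -= 1
            else:
                counts[quark] += 1
    # Drop zero entries so that states can be compared as plain dicts.
    return {flavour: n for flavour, n in counts.items() if n != 0}


def strangeness(state):
    """S = -(number of s quarks minus number of anti-s quarks)."""
    return -net_flavour(state).get("s", 0)


def allowed_by_strong_interaction(initial, final):
    """True if every net flavour number is conserved."""
    return net_flavour(initial) == net_flavour(final)


print(allowed_by_strong_interaction(["pi-", "p"], ["K0", "Lambda0"]))
print(allowed_by_strong_interaction(["pi-", "p"], ["K0", "n"]))
```

In the first reaction the $K^0$ (S = +1) and the $\Lambda^0$ (S = −1) carry cancelling strangeness, which is the pattern the text calls associated production.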
TABLE 1.2
Properties of SM quarks.

Generation  Particle             Mass               Q/e    S    C    $\tilde{B}$  T
1           $u_r$ $u_b$ $u_g$    1.7 to 3.1 MeV     2/3    0    0    0            0
            $d_r$ $d_b$ $d_g$    4.1 to 5.7 MeV     -1/3   0    0    0            0
2           $c_r$ $c_b$ $c_g$    1.15 to 1.35 GeV   2/3    0    1    0            0
            $s_r$ $s_b$ $s_g$    80 to 130 MeV      -1/3   -1   0    0            0
3           $t_r$ $t_b$ $t_g$    172 to 174 GeV     2/3    0    0    0            1
            $b_r$ $b_b$ $b_g$    4 to 5 GeV         -1/3   0    0    -1           0

In weak interactions, by contrast, quark flavour is generally not conserved. 
For example, in the semi-leptonic decay 


    $\Lambda^0$(uds) $\to$ p(uud) + $e^-$ + $\bar{\nu}_e$,   (1.11)


an s quark changes into a u quark. The rather complicated flavour structure 
of weak interactions, which remains an active field of study, will be reviewed 
when we come to the GSW theory in volume 2. However, one very important, 
though technical, point must be made about the weak interactions of quarks 
and leptons. It is natural to wonder whether a new generation of quarks 
might appear, unaccompanied by the corresponding leptons — or vice versa. 
Within the framework of the Standard Model interactions, the answer is no. 
It turns out that subtle quantum field theory effects called ‘anomalies’, to be 
discussed in chapter 18 of volume 2, would spoil the renormalizability of the 
weak interactions (see section 1.4.1), unless there are equal numbers of quark 
and lepton generations. 
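
The anomaly argument itself is deferred to volume 2, but one commonly quoted consequence can be checked by arithmetic: within a single generation, the electric charges of the quarks (counted once per colour) and of the leptons sum to zero. That this charge sum is among the anomaly-cancellation conditions is stated here as an assumption going beyond the present text; the sketch only verifies the arithmetic with the charges of Tables 1.1 and 1.2.

```python
from fractions import Fraction

# Charges in units of e for the members of one generation.
CHARGE = {
    "up-type quark": Fraction(2, 3),
    "down-type quark": Fraction(-1, 3),
    "neutrino": Fraction(0),
    "charged lepton": Fraction(-1),
}


def generation_charge_sum(n_colours=3):
    """Sum of charges over one generation, each quark counted n_colours times."""
    quark_sum = n_colours * (CHARGE["up-type quark"] + CHARGE["down-type quark"])
    lepton_sum = CHARGE["neutrino"] + CHARGE["charged lepton"]
    return quark_sum + lepton_sum


print(generation_charge_sum())             # three colours
print(generation_charge_sum(n_colours=1))  # hypothetical colourless quarks
```

The sum vanishes only when each quark comes in three colours and is accompanied by its lepton doublet, which gives a feel for why quark and lepton generations must appear together.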

We end this section with some comments about the quark masses; the 
values listed in Table 1.2 are based on those given in Nakamura et al. (2010). 
As we have already noted, the t quark is the only one whose mass can be 
directly measured. All the others are (it would appear) permanently confined 
inside hadrons. It is therefore not immediately obvious how to define — and 
measure — their masses. In a more familiar bound state problem, such as a 
nucleus, the masses of the constituents are those we measure when they are 
free of the nuclear binding forces — i.e. when they are far apart. For the QCD 
force, the situation is very different. There it turns out that the force is very 
weak at short distances, a property called asymptotic freedom — see section 
1.3.6; this important property will be treated in section 15.3 of volume 2. We 
may think of the force as very roughly analogous to that of a spring joining two 
constituents. To separate them, energy must be supplied to the system. So 
when the constituents are no longer close, the energy of the system is greater
than the sum of the short distance (free) quark masses. In potential models
(see section 1.3.6), the effect is least pronounced for the ‘heavy’ quarks ($m_q$
greater than about 1 GeV). For example, the ground state of the $\Upsilon$($b\bar{b}$) lies at
about 9.46 GeV, which is close to the average value of $2m_b$ as given in Table
1.2. For $\psi$($c\bar{c}$) the ground state is at about 3 GeV, somewhat greater than
$2m_c$. For the three lightest quarks, and especially for the u and d quarks, the
position is quite different: for example, the proton (uud) with a mass of 938
MeV is far more massive than $2m_u + m_d$. Here the ‘spring’ is responsible for
about 300 MeV per quark.
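
The comparison can be made quantitative with the mass ranges of Table 1.2. The sketch below takes the midpoint of each quoted range as a representative quark mass, which is a simplifying assumption made here and not a procedure from the text; an antiquark is given the same mass as the corresponding quark.

```python
# Quark mass ranges from Table 1.2, in MeV.
QUARK_MASS_RANGE_MEV = {
    "u": (1.7, 3.1),
    "d": (4.1, 5.7),
    "c": (1150.0, 1350.0),
    "b": (4000.0, 5000.0),
}

# Constituents and ground-state masses (MeV) as quoted in the text.
HADRONS = {
    "proton": (["u", "u", "d"], 938.0),
    "Upsilon": (["b", "b"], 9460.0),  # b and anti-b
    "psi": (["c", "c"], 3000.0),      # c and anti-c, "about 3 GeV"
}


def quark_mass_sum_range(quarks):
    """Lower and upper bounds on the summed quark masses, in MeV."""
    low = sum(QUARK_MASS_RANGE_MEV[q][0] for q in quarks)
    high = sum(QUARK_MASS_RANGE_MEV[q][1] for q in quarks)
    return low, high


def mass_ratio(name):
    """Hadron mass divided by the midpoint of the summed quark masses."""
    quarks, hadron_mass = HADRONS[name]
    low, high = quark_mass_sum_range(quarks)
    return hadron_mass / (0.5 * (low + high))


def excess_per_quark(name):
    """Mass not accounted for by the quark masses, shared equally per quark."""
    quarks, hadron_mass = HADRONS[name]
    low, high = quark_mass_sum_range(quarks)
    return (hadron_mass - 0.5 * (low + high)) / len(quarks)


for name in HADRONS:
    print(name, round(mass_ratio(name), 2), round(excess_per_quark(name), 1))
```

For the heavy-quark states the hadron mass is within roughly twenty per cent of the summed quark masses, whereas for the proton the quark masses account for only about one per cent of the total, leaving a little over 300 MeV per quark.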

While this picture is qualitatively useful, it is clearly model dependent, 
as would be even a more sophisticated quark model. To do the job properly, 
we have to go to the actual QCD Lagrangian, and use it to calculate the 
hadron masses with the Lagrangian masses as input. This can be done through 
a lattice simulation of the field theory, as will be described in chapter 16 
of volume 2. Independently, another handle on the Lagrangian masses is 
provided by the fact that the QCD Lagrangian has an extra symmetry (‘chiral 
symmetry’) which is exact when the quark masses are zero. This is, in fact, 
an excellent approximation for the u and d quarks, and a fair one for the 
s quark. The symmetry is, however, dynamically (‘spontaneously’) broken 
by QCD, in such a way as to generate (in the case $m_u = m_d = 0$) the
nucleon mass entirely dynamically, along with a massless pion. The small 
Lagrangian masses can then be treated perturbatively in a procedure called 
‘chiral perturbation theory’. These essential features of QCD will be treated 
in chapter 18 of volume 2. For the moment, we accept the values in Table 1.2; 
Nakamura et al. (2010) contains a review of quark masses. 




1.3 Particle interactions in the Standard Model 
1.3.1 Classical and quantum fields 


In the world of the classical physicist, matter and force were clearly separated. 
The nature of matter was intuitive, based on everyday macroscopic experience; 
force, however, was more problematical. Contact forces between bodies were 
easy to understand, but forces which seemed capable of acting at a distance 
caused difficulties. 


That gravity should be innate, inherent and essential to matter, so 
that one body can act upon another at a distance, through a vacuum, 
without the mediation of anything else, by and through which action 
and force may be conveyed from one to the other, is to me so great 
an absurdity, that I believe no man who has in philosophical matters 
a competent faculty of thinking can ever fall into it. (Letter from
Newton to Bentley) 


Newton could find no satisfactory mechanism, or physical model, for the
transmission of the gravitational force between two distant bodies; but his
dynamical equations provided a powerful predictive framework, given the
(unexplained) gravitational force law; and this eventually satisfied most people.

The 19th century saw the precise formulation of the more intricate force 
laws of electromagnetism. Here too the distaste for action-at-a-distance the- 
ories led to numerous mechanical or fluid mechanical models of the way elec- 
tromagnetic forces (and light) are transmitted. Maxwell made brilliant use 
of such models as he struggled to give physical and mathematical substance 
to Faraday’s empirical ideas about lines of force. Maxwell’s equations were 
indeed widely regarded as describing the mechanical motion of the ether — an 
amazing medium, composed of vortices, gear wheels, idler wheels and so on. 
But in his 1864 paper, the third and final one of the series on lines of force 
and the electromagnetic field, Maxwell himself appeared ready to throw away 
the mechanical scaffolding and let the finished structure of the field equations 
stand on its own. Later these field equations were derived from a Lagrangian 
(see chapter 7), and many physicists came to agree with Poincaré that this 
‘generalized mechanics’ was more satisfactory than a multitude of different 
ether models; after all, the same mathematical equations can describe, when 
suitably interpreted, systems of masses, springs and dampers, or of induc- 
tors, capacitors and resistors. With this step, the concepts of mechanics were 
enlarged to include a new fundamental entity, the electromagnetic field. 

The action-at-a-distance dilemma was solved, since the electromagnetic 
field permeates all of space surrounding charged or magnetic bodies, responds 
locally to them, and itself acts on other distant bodies, propagating the action 
to them at the speed of light: for Maxwell’s theory, besides unifying electricity 
and magnetism, also predicted the existence of electromagnetic waves which 
should travel with the speed of light, as was confirmed by Hertz in 1888. 
Indeed, light was a form of electromagnetic wave. 

Maxwell published his equations for the dynamics of the electromagnetic 
field (Maxwell 1864) some forty years before Einstein’s 1905 paper introducing 
special relativity. But Maxwell’s equations are fully consistent with relativ- 
ity as they stand (see chapter 2), and thus constitute the first relativistic 
(classical) field theory. The Maxwell Lagrangian lives on, as part of QED. 

It seems almost to be implied by the local field concept, and the desire to 
avoid action at a distance, that the fundamental carriers of electricity should 
themselves be point-like, so that the field does not, for example, have to 
interact with different parts of an electron simultaneously. Thus the point- 
like nature of elementary matter units seems intuitively to be tied to the local 
nature of the force field via which they interact. 

Very soon after the successes of classical field physics, however, another 
world began to make its appearance — the quantum one. First the photoelec- 
tric effect and then — much later — the Compton effect showed unmistakeably 
that electromagnetic waves somehow also had a particle-like aspect, the photon.
At about the same time, the intuitive understanding of the nature of
matter began to fail as well: supposedly particle-like things, like electrons, 
displayed wave-like properties (interference and diffraction). Thus the con- 
ceptual distinction between matter and forces, or between particle and field, 
was no longer so clear. On the one hand, electromagnetic forces, treated in 
terms of fields, now had a particle aspect; and on the other hand, particles 
now had a wave-like or field aspect. ‘Electrons’, writes Feynman (1965a) at 
the beginning of volume 3 of his Lectures on Physics, ‘behave just like light’. 

How can we build a theory of electrons and photons which does justice to 
all the ‘point-like’, ‘local’, ‘wave/particle’ ideas just discussed? Consider the 
apparently quite simple process of spontaneous decay of an excited atomic 
state in which a photon is emitted: 


$$A^* \to A + \gamma \tag{1.12}$$


Ordinary non-relativistic quantum mechanics cannot provide a first-principles 
account of this process, because the degrees of freedom it normally discusses 
are those of the “matter” units alone — that is, in this example, the electronic 
degrees of freedom. However, it is clear that something has changed radi- 
cally in the field degrees of freedom. On the left-hand side, the matter is in 
an excited state and the electromagnetic field is somehow not manifest; on 
the right, the matter has made a transition to a lower-energy state and the 
energy difference has gone into creating a quantum of electromagnetic radia- 
tion. What is needed here is a quantum theory of the electromagnetic field — 
a quantum field theory. 

Quantum field theory — or qft for short — is the fundamental formal and 
conceptual framework of the Standard Model. An important purpose of this 
book is to make this core twentieth century formalism more generally accessi- 
ble. In chapter 5 we give a step-by-step introduction to qft. We shall see that 
a free classical field — which has infinitely many degrees of freedom — can be 
thought of as mathematically analogous to a vibrating solid (which has merely 
a very large number). The way this works mathematically is that the Fourier 
components of the field act like independent harmonic oscillators, just like the 
vibrational ‘normal modes’ of the solid. When quantum mechanics is applied 
to this system, the energy eigenstates of each oscillator are quantized in the 
familiar way, as $(n_r + 1/2)\hbar\omega_r$ for each oscillator of frequency $\omega_r$: we say that
such states contain ‘$n_r$ quanta of frequency $\omega_r$’. The state of the entire field
is characterized by how many quanta of each frequency are present. These 
‘excitation quanta’ are the particle aspect of the field. In the ground state 
there are no excitations present — no field quanta — and so that is the vacuum 
state of the field. 
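
The counting of quanta described above can be illustrated with a minimal sketch. It assumes units with ħ = 1 and three made-up mode frequencies; the function names `mode_energy` and `field_energy` are hypothetical and are not taken from the text.

```python
# Sketch: a free field treated as a set of independent harmonic oscillators.
# Assumptions: hbar = 1, and the three mode frequencies are purely illustrative.

def mode_energy(n_r, omega_r, hbar=1.0):
    """Energy (n_r + 1/2) * hbar * omega_r of a single oscillator mode."""
    return (n_r + 0.5) * hbar * omega_r

def field_energy(occupations, hbar=1.0):
    """Total energy for a dict {omega_r: n_r} giving the quanta in each mode."""
    return sum(mode_energy(n, w, hbar) for w, n in occupations.items())

omegas = [1.0, 2.0, 3.0]                  # illustrative mode frequencies
vacuum = {w: 0 for w in omegas}           # ground state: no quanta in any mode
one_quantum = {1.0: 0, 2.0: 1, 3.0: 0}    # one quantum of frequency 2.0

# Energy needed to create the single quantum, measured from the vacuum.
excitation_energy = field_energy(one_quantum) - field_energy(vacuum)
```

In this toy example the excitation energy equals ħω for the occupied mode, which is the sense in which the quantum is the particle aspect of the field.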

In the case of the electromagnetic field, these quanta are of course photons 
(for the solid, they are phonons). In the process (1.12) the electromagnetic 
field was originally in its ground (no photon) state, and was raised finally to an 
excited state by the transfer of energy from the electronic degrees of freedom. 
The final excited field state is defined by the presence of one quantum (photon)
of the appropriate energy. 

We obviously cannot stop here (‘Electrons behave just like light’). All the 
particles of the SM must be described as excitation quanta of the correspond- 
ing quantum fields. But of course Feynman was somewhat overstating the 
case. The quanta of the electromagnetic field are bosons, and there is no limit 
on the number of them that can occupy a single quantum state. By contrast, 
the quanta of the electron field, for example, must be fermions, obeying the 
exclusion principle. In chapter 7 we shall see what modifications to the quan- 
tization procedure this requires. We must also introduce interactions between 
the excitation quanta, or equivalently between the quantum fields. This we 
do in chapter 6 for bosonic fields, and in chapter 7 for the Dirac and Maxwell
fields, thereby arriving at QED, our first quantum gauge field theory of the
SM. 

One reason the Lagrangian formulation of classical field (or particle) physics 
is so powerful is that symmetries can be efficiently incorporated, and their con- 
nection with conservation laws easily exhibited. The same is even more true 
in qft. For example, only in qft can the symmetry corresponding to electric 
charge conservation be simply understood. Indeed, all the quantum gauge 
field theories of the SM are deeply related to symmetries, as will become clear 
in the subsequent development. 

In some cases, however, the symmetry — though manifest in the Lagrangian 
— is not visible in the usual empirical ways (conservation laws, particle multi- 
plets, and so on). Instead, it is ‘spontaneously (or dynamically) broken’. This 
phenomenon plays a crucial role in both QCD and the GSW theory. An aid to 
understanding it physically is provided by the analogy between the vacuum 
state of an interacting qft and the ground state of an interacting quantum 
many-body system — an insight due to Nambu (1960). We give an extended 
discussion of spontaneously broken symmetry in Part VII of volume 2. We 
shall see how the neutral bosonic (Bogoliubov) superfluid, and the charged 
fermionic (BCS) superconductor, offer instructive working models of dynami- 
cal symmetry breaking, relevant to chiral symmetry breaking in QCD, and to 
the generation of gauge boson masses in the GSW theory. 

The road ahead is a long one, and we begin our journey at a more descrip- 
tive and pictorial level, making essential use of Yukawa’s remarkable insight 
into the quantum nature of force. In due course, in chapter 6, we shall be- 
gin to see how qft supplies the precise mathematical formulae associated with 
such pictures. 


1.3.2 The Yukawa theory of force as virtual quantum exchange


Yukawa’s revolutionary paper (Yukawa 1935) proposed a theory of the strong 
interaction between a proton and a neutron, and also considered its possible 
extension to neutron β-decay. He built his theory by analogy with electromagnetism,
postulating a new field of force with an associated new field quantum,
analogous to the photon. In doing so, he showed with particular clarity how, 
in quantum field theory, particles interact by exchanging virtual quanta, which 
mediate the force. 

Before proceeding, we should emphasize that we are not presenting Yukawa’s 
ideas as a viable candidate theory of strong and weak interactions. Crucially, 
Yukawa assumed that the nucleons and his quantum (later identified with the 
pion) were point-like, but in fact both nucleons and pions are quark compos- 
ites with spatial extension. The true ‘strong’ interaction relates to the quarks, 
as we shall see in section 1.3.6. There are also other details of his theory which 
were (we now know) mistaken, as we shall discuss. Yet his approach was pro- 
found, and — as happens often in physics — even though the initial application 
was ultimately superseded, the ideas have broad and lasting validity. 

Yukawa began by considering what kind of static potential might describe 
the n-p interaction. It was known that this interaction decreased rapidly 
for interparticle separation r > 2 fm. Hence, the potential could not be of 
coulombic type ∝ 1/r. Instead, Yukawa postulated an n-p potential energy
of the form

$$U(r) = -\frac{g_N^2}{4\pi}\,\frac{e^{-r/a}}{r} \tag{1.13}$$

where ‘$g_N$’ is a constant analogous to the electric charge e, $r = |\mathbf{r}|$ and ‘a’ is
a range parameter (~ 2 fm). This static potential satisfies the equation


$$\left(\nabla^2 - \frac{1}{a^2}\right) U(\mathbf{r}) = g_N^2\,\delta(\mathbf{r}) \tag{1.14}$$


(see appendix G) showing that it may be interpreted as the mutual potential 
energy of one point-like test nucleon of ‘strong charge’ $g_N$ due to the presence
of another point-like nucleon of equal charge $g_N$ at the origin, a distance r
away. Equation (1.14) should be thought of as a finite range analogue of
Poisson’s equation in electrostatics (equation (G.3))

$$\nabla^2 V(\mathbf{r}) = -\rho(\mathbf{r})/\epsilon_0, \tag{1.15}$$


the delta function in (1.14) (see appendix E) expressing the fact that the 
‘strong charge density’ acting as the source of the field is all concentrated into 
a single point, at the origin. 
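
Away from the origin the delta function vanishes, so the static equation reduces to $\nabla^2 U = U/a^2$. The sketch below checks this numerically for the Yukawa form $U(r) = -(g_N^2/4\pi)\,e^{-r/a}/r$, using the radial Laplacian $(1/r)\,d^2(rU)/dr^2$ for a spherically symmetric function. The values g = 1 and a = 2, the step size, and the function names are illustrative assumptions.

```python
import math

# Sketch: check numerically that the Yukawa potential satisfies
# (laplacian - 1/a^2) U = 0 for r != 0. Parameters are illustrative only.

def yukawa(r, g=1.0, a=2.0):
    """Yukawa potential U(r) = -(g^2 / 4 pi) * exp(-r/a) / r."""
    return -(g ** 2 / (4.0 * math.pi)) * math.exp(-r / a) / r

def radial_laplacian(f, r, h=1e-3):
    """Laplacian of a spherically symmetric f(r): (1/r) d^2(r f)/dr^2,
    with the second derivative taken by a central finite difference."""
    u = lambda x: x * f(x)
    return (u(r + h) - 2.0 * u(r) + u(r - h)) / (h * h * r)

a = 2.0
potential = lambda x: yukawa(x, a=a)

# Residual of (laplacian - 1/a^2) U at a few radii away from the origin.
residuals = [radial_laplacian(potential, r) - potential(r) / a ** 2
             for r in (0.5, 1.0, 3.0)]
```

The residuals vanish to within the finite-difference error, which is all this check can show; the delta-function source at r = 0 is not probed by it.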

Yukawa now sought to generalize (1.14) to the non-static case, so as to 
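obtain a field equation for U(r,t). For r ≠ 0, he proposed the free-space
equation (we shall keep factors of c and ħ explicit for the moment)

$$\left(\nabla^2 - \frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \frac{1}{a^2}\right) U(\mathbf{r}, t) = 0 \tag{1.16}$$
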
which is certainly relativistically invariant (see appendix D). Thus far, U is 
still a classical field. Now Yukawa took the decisive step of treating U quantum 
mechanically, by looking for a (de Broglie-type) propagating wave solution of
(1.16), namely

$$U \propto \exp(i\mathbf{p}\cdot\mathbf{r}/\hbar - iEt/\hbar). \tag{1.17}$$


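Inserting (1.17) into (1.16) one finds

$$\frac{E^2}{c^2\hbar^2} = \frac{\mathbf{p}^2}{\hbar^2} + \frac{1}{a^2} \tag{1.18}$$

or, taking the positive square root,

$$E = \left[c^2\mathbf{p}^2 + \frac{c^2\hbar^2}{a^2}\right]^{1/2}.$$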


Comparing this with the standard E—p relation for a massive particle in spe- 
cial relativity (appendix D), the fundamental conclusion is reached that the 
quantum of the finite-range force field U has a mass $m_U$ given by

$$m_U^2 c^4 = \frac{c^2\hbar^2}{a^2} \quad \text{or} \quad m_U = \frac{\hbar}{ac}. \tag{1.19}$$


This means that the range parameter in (1.13) is related to the mass of the 
quantum $m_U$ by

$$a = \frac{\hbar}{m_U c}. \tag{1.20}$$

Inserting a ≈ 2 fm gives $m_U \approx 100$ MeV, Yukawa’s famous prediction for the
mass of the nuclear force quantum. 
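
The arithmetic behind this estimate is short enough to spell out. The sketch below assumes the standard conversion constant ħc ≈ 197.327 MeV fm, which is not quoted in the text; the function names are hypothetical.

```python
# Sketch: range-mass relation a = hbar / (m_U c), written as m_U c^2 = hbar c / a.
# Assumption: hbar * c = 197.327 MeV fm (standard conversion constant).

HBAR_C_MEV_FM = 197.327

def quantum_mass_mev(range_fm):
    """Rest energy m_U c^2 in MeV of the quantum mediating a force of given range."""
    return HBAR_C_MEV_FM / range_fm

def force_range_fm(mass_mev):
    """Range in fm of the force mediated by a quantum of given rest energy."""
    return HBAR_C_MEV_FM / mass_mev

m_U = quantum_mass_mev(2.0)   # range of about 2 fm, as assumed by Yukawa
```

With a = 2 fm this gives a rest energy just under 100 MeV, in line with the rounded figure quoted above.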

Next, Yukawa envisaged that the U-quantum would be emitted in the 
transition n → p, via a process analogous to (1.12):

$$n \to p + U^- \tag{1.21}$$

where charge conservation determines the $U^-$ charge. Yet there is an obvious
difference between (1.21) and (1.12): (1.21) violates energy conservation since
$m_n < m_p + m_U$ if $m_U \approx 100$ MeV, so it cannot occur as a real emission process.
However, Yukawa noted that if (1.21) were combined with the inverse process

$$p + U^- \to n \tag{1.22}$$


then an n-p interaction could take place by the mechanism shown in figure 1.1(a);
namely, by the emission and subsequent absorption (that is, by the exchange) of
a $U^-$ quantum. He also included the corresponding $U^+$
exchange, where $U^+$ is the antiparticle of the $U^-$, as shown in figure 1.1(b).

An energy-violating transition such as (1.21) is known as a ‘virtual’ transi- 
tion in quantum mechanics. Such transitions are routinely present in quantum- 
mechanical time-dependent perturbation theory and can be understood in 
terms of an ‘energy—time uncertainty relation’ 


$$\Delta E\,\Delta t \geq \hbar/2. \tag{1.23}$$

FIGURE 1.1
Yukawa’s single-U exchange mechanism for the n-p interaction. (a) $U^-$ exchange.
(b) $U^+$ exchange.


The relation (1.23) may be interpreted as follows (we abridge the careful 
discussion in section 44 of Landau and Lifshitz (1977)). Imagine an ‘energy- 
measuring device’ set up to measure the energy of a quantum system. To do 
this, the device must interact with the quantum system for a certain length of 
time Δt. If the energy of a sequence of identically prepared quantum systems
is measured, only in the limit Δt → ∞ will the same energy be obtained
each time. For finite Δt, the measured energies will necessarily fluctuate by
an amount ΔE as given by (1.23); in particular, the shorter the time over
which the energy measurement takes place, the larger the fluctuations in the 
measured energy. 

Wick (1938) applied (1.23) to Yukawa’s theory, and thereby shed new light 
on the relation (1.20). Suppose a device is set up capable of checking to see 
whether energy is, in fact, conserved while the $U^\pm$ crosses over in figure 1.1.
The crossing time t must be at least r/c, where r is the distance apart of the 
nucleons. However, the device must be capable of operating on a time scale 
smaller than t (otherwise it will not be in a position to detect the $U^\pm$), but
it need not be very much less than this. Thus the energy uncertainty in the
reading by the device will be³

$$\Delta E \sim \frac{\hbar c}{r}. \tag{1.24}$$


As r decreases, the uncertainty ΔE in the measured energy increases. If we
require $\Delta E = m_U c^2$, then

$$r \sim \frac{\hbar}{m_U c} \tag{1.25}$$

just as in (1.20). The ‘r’ in (1.25) is the extent of the separation allowed
between the n and the p, such that, in the time available, the $U^\pm$ can
‘borrow’ the necessary energy to come into existence and cross from one to
the other. In this sense, r is the effective range of the associated force, as in
(1.20).

³ In this kind of argument, the ‘~’ sign should be understood as meaning that numerical
factors of order 1 (such as 2 or π) are not important. The coincidence between (1.25) and
(1.20) should not be taken too literally. Nevertheless, the physics of (1.25) is qualitatively
correct.

FIGURE 1.2
Scattering by a static point-like U-source.

Despite the similarity to virtual intermediate states in ordinary quantum 
mechanics, the Yukawa-Wick process is nevertheless truly revolutionary because
it postulated an energy fluctuation ΔE great enough to create an as yet
unseen new particle, a new state of matter. 

We proceed to explore further aspects of Yukawa’s force mechanism. The 
reader should note that throughout the remainder of this book we shall generally
(unless otherwise stated) use units such that ħ = c = 1: see Appendix B.


1.3.3 The one-quantum exchange amplitude 


Consider a particle, carrying ‘strong charge’ $g_N$, being scattered by an infinitely
massive (static) point-like U-source also of ‘charge’ $g_N$, as pictured in
figure 1.2. From the previous section, we know that the potential energy in
the Schrödinger equation for the scattered particle is precisely the U(r) from
(1.13). Treating this to its lowest order in U(r) (‘Born Approximation’: see
appendix H), the scattering amplitude is proportional to the Fourier transform
of U(r):

$$f(\mathbf{q}) = \int e^{i\mathbf{q}\cdot\mathbf{r}}\, U(\mathbf{r})\, d^3\mathbf{r} \tag{1.26}$$


where $\mathbf{q}$ is the momentum (or wavevector, since ħ = 1) transfer $\mathbf{q} = \mathbf{k} - \mathbf{k}'$.
The transform is evaluated in appendix G equation (G.24), or in problem 1.1,
with the result

$$f(\mathbf{q}) = -\frac{g_N^2}{\mathbf{q}^2 + m_U^2}. \tag{1.27}$$


This implies that the amplitude (in this static case) for one-U exchange
is proportional to $-1/(\mathbf{q}^2 + m_U^2)$, where $\mathbf{q}$ is the momentum carried
by the U-quantum.
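
The Fourier transform of the Yukawa potential can also be checked numerically. For a spherically symmetric U(r) the angular integration gives $f(\mathbf{q}) = (4\pi/|\mathbf{q}|)\int_0^\infty r\,U(r)\sin(|\mathbf{q}|r)\,dr$, and the remaining radial integral is done below with Simpson's rule. The sketch assumes units ħ = c = 1 (so that $m_U = 1/a$), illustrative values g = m = 1 and |q| = 2, and a cutoff at r = 40/m; the function names are hypothetical.

```python
import math

# Sketch: numerical check that the Fourier transform of the Yukawa potential
# U(r) = -(g^2 / 4 pi) exp(-m r) / r equals -g^2 / (q^2 + m^2).
# Assumptions: hbar = c = 1, m = 1/a, illustrative parameter values.

def simpson(f, lo, hi, n=40000):
    """Composite Simpson rule on [lo, hi] with n (even) subintervals."""
    if n % 2:
        n += 1
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3.0

def born_amplitude(q, g=1.0, m=1.0):
    """f(q) = (4 pi / q) * integral_0^inf r U(r) sin(q r) dr, truncated at 40/m."""
    def integrand(r):
        # r * U(r) = -(g^2 / 4 pi) * exp(-m r): the 1/r singularity cancels.
        return -(g ** 2 / (4.0 * math.pi)) * math.exp(-m * r) * math.sin(q * r)
    return (4.0 * math.pi / q) * simpson(integrand, 0.0, 40.0 / m)

numeric = born_amplitude(2.0)
exact = -1.0 / (2.0 ** 2 + 1.0 ** 2)   # -g^2 / (q^2 + m^2) with g = m = 1
```

The two numbers agree to high accuracy, since the truncation error is of order exp(-40) and the integrand is smooth.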

In this scattering by an infinitely massive source of potential, the energy 
of the scattered particle cannot change. In a real scattering process such as 
that in figure 1.1, both energy and momentum can be transferred by the U-quantum;
that is, $\mathbf{q}$ is replaced by the four-momentum $q = (q_0, \mathbf{q})$, where
$q_0 = k_0 - k_0'$. Then, as indicated in appendix G, the factor $-1/(\mathbf{q}^2 + m_U^2)$ is
replaced by $1/(q^2 - m_U^2)$ and the amplitude for figure 1.1 is, in this model,

$$\frac{g_N^2}{q^2 - m_U^2}. \tag{1.28}$$


It will be the main burden of chapters 5 and 6 to demonstrate just how 
this formula is arrived at, using the formalism of quantum field theory. In 
particular, we shall see in detail how the propagator $(q^2 - m_U^2)^{-1}$ arises. For
the present, we can already note (from appendix G) that such propagators 
are, in fact, momentum-space Green functions. 

In chapter 6 we shall also discuss other aspects of the physical meaning of 
the propagator, and we shall see how diagrams which we have begun to draw 
in a merely descriptive way become true ‘Feynman diagrams’, each diagram 
representing by a precise mathematical correspondence a specific expression 
for a quantum amplitude, as calculated in perturbation theory. The expansion 
parameter of this perturbation theory is the dimensionless number $g_N^2/4\pi$
appearing in the potential U(r) (cf (1.13)). In terms of Feynman diagrams,
we shall learn in chapter 6 that one power of $g_N$ is to be associated with each
‘vertex’ at which a U-quantum is emitted or absorbed. Thus successive terms
in the perturbation expansion correspond to exchanges of more and more
quanta. Quantities such as $g_N$ are called ‘coupling strengths’, or ‘coupling
constants’. 

It is not too early to emphasize one very important point to the reader: true 
Feynman diagrams are representations of momentum-space amplitudes. They 
are not representations of space-time processes: all space-time points are 
integrated over in arriving at the formula represented by a Feynman diagram. 
In particular, the two ‘intuitive’ diagrams of figure 1.1, which carry an implied 
‘time-ordering’ (with time increasing to the right), are both included in a single 
Feynman diagram with propagator (1.28), as we shall see in detail (for an 
analogous case) in section 7.1. 

We now indicate how these general ideas of Yukawa apply to the actual 
interactions of quarks and leptons. 
FIGURE 1.3
One photon exchange mechanism between charged leptons. 


1.3.4 Electromagnetic interactions 


From the foregoing viewpoint, electromagnetic interactions are essentially a 
special case of Yukawa’s picture, in which $g_N$ is replaced by the appropriate
electromagnetic charges, and $m_U \to m_\gamma = 0$ so that $a \to \infty$ and the potential
(1.13) returns to the Coulomb one, $-e^2/4\pi r$. A typical one-photon exchange
scattering process is shown in figure 1.3, for which the generic amplitude (1.28)
becomes

$$e^2/q^2. \tag{1.29}$$


Note that we have drawn the photon line ‘vertically’, consistent with the 
fact that both time-orderings of the type shown in figure 1.1 are included in 
(1.29). In the case of electromagnetic interactions, the coupling strength is e 
and the expansion parameter of perturbation theory is $e^2/4\pi = \alpha \approx 1/137$
(see appendix C). 

We can immediately use (1.29) to understand the famous $\sim \sin^{-4}\theta/2$ angular
variation of Rutherford scattering. Treating the target muon as infinitely
heavy (so as to simplify the kinematics), the electron scatters elastically so
that $q_0 = 0$ and $q^2 = -(\mathbf{k} - \mathbf{k}')^2$, where $\mathbf{k}$ and $\mathbf{k}'$ are the incident and final
electron momenta. So $q^2 = -2\mathbf{k}^2(1 - \cos\theta) = -4\mathbf{k}^2 \sin^2\theta/2$, where we
have used the elastic scattering condition $\mathbf{k}^2 = \mathbf{k}'^2$. Inserting this into (1.29)
and remembering that the cross section is proportional to the square of the
amplitude (appendix H) we obtain the distribution $\sin^{-4}\theta/2$. Thus, such a
distribution is a clear signature that the scattering is proceeding via the exchange
of a massless quantum.
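
The kinematics of this argument can be verified with a few lines of code. The sketch below builds the momentum transfer from explicit two-dimensional momentum vectors for elastic scattering off a static target, squares the amplitude $e^2/q^2$, and confirms that the result multiplied by $\sin^4\theta/2$ does not depend on the angle. The choices e² = 1 and |k| = 1 and the function names are illustrative assumptions, and overall normalization factors of the cross section are ignored.

```python
import math

# Sketch: the squared one-photon-exchange amplitude (e^2 / q^2)^2 has the
# Rutherford angular shape 1 / sin^4(theta / 2). Illustrative units only.

def q_squared(k, theta):
    """Four-momentum transfer squared for elastic scattering off a static
    target: q0 = 0 and |k| = |k'|, so q^2 = -(k - k')^2."""
    kx, ky = k, 0.0                                         # incident momentum
    kpx, kpy = k * math.cos(theta), k * math.sin(theta)     # final momentum
    return -((kx - kpx) ** 2 + (ky - kpy) ** 2)

def rutherford_shape(k, theta, e2=1.0):
    """Square of the amplitude e^2 / q^2 (cross-section shape, unnormalized)."""
    amplitude = e2 / q_squared(k, theta)
    return amplitude ** 2

angles = [0.3, 1.0, 2.0, 3.0]
# Multiplying by sin^4(theta/2) should remove all angular dependence.
scaled = [rutherford_shape(1.0, t) * math.sin(t / 2.0) ** 4 for t in angles]
```

For |k| = 1 and e² = 1 every entry of `scaled` equals 1/16, reflecting $q^2 = -4\mathbf{k}^2\sin^2\theta/2$.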

Unfortunately, the detailed implementation of these ideas to the electro- 
magnetic interactions of quarks and leptons is complicated, because the elec- 
tromagnetic potentials are the components of a 4-vector (see chapter 2), rather 
than a scalar as in (1.29), and the quarks and leptons all have spin-1/2, necessitating
the use of the Dirac equation (chapter 3). Nevertheless, (1.29) remains
the essential ‘core’ of electromagnetic amplitudes.

FIGURE 1.4
Yukawa’s U-exchange mechanism for neutron β-decay.

As far as the electromagnetic field is concerned, its 4-vector nature is ac- 
tually a fundamental feature, having to do with a symmetry called gauge 
invariance, or (better) local phase invariance. As we shall see in chapters 2 
and 7, the form of the electromagnetic interaction is very strongly constrained 
by this symmetry. In fact, turning the argument around, one can (almost) 
understand the necessity of electromagnetic interactions as being due to the re- 
quirement of gauge invariance. Most significantly, we shall see in section 7.3.1 
how the masslessness of the photon is also related to gauge invariance. 

In chapter 8 a number of elementary electromagnetic processes will be fully 
analysed, and in chapter 11 we shall discuss higher-order corrections in QED. 


1.3.5 Weak interactions 


In a bold extension of his ‘strong force’ idea, Yukawa extended his theory 
to describe neutron β-decay as well, via the hypothesized process shown in
figure 1.4 (here and in figure 1.5 we revert to the more intuitive ‘time-ordered’ 
picture — the reader may supply the diagrams corresponding to the other time- 
ordering). As indicated on the diagram, Yukawa assigned the strong charge 
$g_N$ at the n-p end, and a different ‘weak’ charge $g'$ at the lepton end. Thus
the same quantum mediated both strong and weak transitions, and he had
an embryonic ‘unified theory’ of strong and weak processes! If we take $U^-$
to be the $\pi^-$, Yukawa’s mechanism predicts the existence of the weak decay
$\pi^- \to e^- + \bar{\nu}_e$.

This decay does indeed occur, though at a much smaller rate than the main 
mode which is $\pi^- \to \mu^- + \bar{\nu}_\mu$. But, apart from the now familiar problem with
the compositeness of the nucleons and pions, this kind of unification is not
chosen by Nature. Not unreasonably in 1935, Yukawa was assuming that the
range $\sim m_U^{-1}$ of the strong force in n-p scattering (figure 1.1) was the same
as that of the weak force in neutron β-decay (figure 1.4); after all, the latter
(and more especially positron emission) was viewed as a nuclear process. But
this is now known not to be the case: in fact, the range of the weak force
is much smaller than nuclear dimensions or, equivalently (see (1.19)), the
masses of the mediating quanta are much greater than that of the pion.

FIGURE 1.5
(a) β-decay and (b) $e^+$ emission at the quark level, mediated by $W^\pm$.

β-decay is now understood as occurring at the quark level via the $W^-$-exchange
process shown in figure 1.5(a). Similarly, positron emission proceeds
via figure 1.5(b). Other ‘charged current’ processes all involve $W^\pm$-exchange,
generalized appropriately to include flavour mixing effects (see volume 2).
‘Neutral current’ processes involve exchange of the $Z^0$-quantum; an example
is given in figure 1.6. The quanta $W^\pm$, $Z^0$ therefore mediate these weak interactions
as does the photon for the electromagnetic one. Like the photon, the
W and Z are the quanta of 4-vector fields⁴ and have spin 1, but unlike the
photon, the masses of the W and Z are far from zero: in fact $M_W \approx 80$ GeV
and $M_Z \approx 91$ GeV. So the range of the force is $\sim M_W^{-1} \sim 2.5 \times 10^{-18}$ m, much
less than typical nuclear dimensions (~ few × $10^{-15}$ m). This, indeed, is one
way of understanding why the weak interactions appear to be so weak: this 
range is so tiny that only a small part of the hadronic volume is affected. 
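
The quoted range follows from restoring the factors of ħ and c in $M_W^{-1}$. The sketch below assumes ħc ≈ 0.197327 GeV fm (a standard conversion constant not given in the text) and takes 2 fm as a representative nuclear length scale; the function name is hypothetical.

```python
# Sketch: range of a force mediated by a quantum of mass M, r ~ hbar c / (M c^2).
# Assumptions: hbar * c = 0.197327 GeV fm; 2 fm as a typical nuclear dimension.

HBAR_C_GEV_FM = 0.197327
FM_IN_M = 1e-15

def range_in_metres(mass_gev):
    """Range in metres for a mediating quantum of the given rest energy in GeV."""
    return HBAR_C_GEV_FM / mass_gev * FM_IN_M

r_W = range_in_metres(80.0)   # W boson, rounded mass from the text
r_Z = range_in_metres(91.0)   # Z boson, rounded mass from the text

nuclear_scale_m = 2.0 * FM_IN_M           # illustrative nuclear length scale
ratio_to_nuclear = r_W / nuclear_scale_m  # how small the weak range is
```

With these inputs the W range comes out near 2.5 × 10⁻¹⁸ m, roughly a thousandth of the nuclear scale.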

Thus Nature has not chosen to unify the strong and weak forces via a 
common mediating quantum. Instead, it has turned out that the weak and 
strong forces (see section 1.3.6) are both gauge theories, generalizations of 
electromagnetism, as will be discussed in volume 2. This raises the possibility
of ‘unifying’ all three forces.


⁴ This is dictated by the phenomenology of weak interactions; see chapter 20 in volume 2.

FIGURE 1.6
$Z^0$-exchange process.


Some initial idea of how this works in the ‘electroweak’ case may be gained 
by considering the amplitude for figure 1.5(a) in the low $-q^2$ limit. In a
simplified version analogous to (1.29) which ignores the spin of the W and of
the leptons, this amplitude is

$$g^2/(q^2 - M_W^2) \tag{1.30}$$


where g is a ‘weak charge’ associated with W-emission and absorption. In 
actual β-decay, the square of the 4-momentum transfer $q^2$ is tiny compared to
$M_W^2$, so that (1.30) becomes independent of $q^2$ and takes the constant value
$-g^2/M_W^2$. This corresponds, in configuration space, to a point-like interaction
(the Fourier transform of a delta function is a constant). Just such a point-like
interaction, shown in figure 1.7, had been postulated by Fermi (1934a, b)
in the first theory of β-decay: it is a ‘four-fermion’ interaction with strength
$G_F$. The value of $G_F$ can be determined from measured β-decay rates. The
dimensions of $G_F$ turn out to be energy × volume, so that $G_F/(\hbar c)^3$ has
dimension (energy)$^{-2}$. In our units ħ = c = 1, the numerical value of $G_F$ is

$$G_F \sim (300\ \text{GeV})^{-2}. \tag{1.31}$$

If we identify this constant with $g^2/M_W^2$, we obtain

$$g^2 \sim M_W^2/(300\ \text{GeV})^2 \sim 0.064 \tag{1.32}$$


a value quite similar to that of the electromagnetic charge $e^2$ as determined
from $e^2 = 4\pi\alpha \approx 0.09$. Though this is qualitatively correct, we shall see
in volume 2 that the actual relation, in the electroweak theory, between the 
weak and electromagnetic coupling strengths is somewhat more complicated 
than the simple equality ‘g = e’. (Note that a corresponding connection with 
Fermi’s theory was also made by Yukawa!) 
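
The order-of-magnitude comparison of the two couplings can be reproduced directly from the rounded inputs quoted above ($M_W \approx 80$ GeV, $G_F \sim (300\ \text{GeV})^{-2}$, α ≈ 1/137). This is only a sketch of the estimate: the variable names are hypothetical, and with these rounded inputs the weak coupling squared comes out near 0.07, which agrees with the value in (1.32) to the accuracy implied by the ‘~’ signs.

```python
import math

# Sketch: compare g^2 = G_F * M_W^2 with e^2 = 4 pi alpha (units hbar = c = 1).
# Inputs are the rounded values quoted in the text, so only the order of
# magnitude of the result is meaningful.

ALPHA = 1.0 / 137.0        # fine structure constant
M_W_GEV = 80.0             # W mass in GeV
G_F = 300.0 ** -2          # Fermi constant in GeV^-2, as (300 GeV)^-2

g_squared = G_F * M_W_GEV ** 2       # identify G_F with g^2 / M_W^2
e_squared = 4.0 * math.pi * ALPHA    # electromagnetic coupling squared

coupling_ratio = g_squared / e_squared
```

The ratio is of order one, which is the content of the statement that the weak and electromagnetic couplings have similar intrinsic strength.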
FIGURE 1.7
Point-like four-fermion interaction. 


We can now understand the ‘weakness’ of the weak interactions from another
viewpoint. For $q^2 \ll M_W^2$, the ratio of the weak amplitude (1.30) to the
electromagnetic amplitude (1.29) is of order $q^2/M_W^2$, given that e ~ g.
Thus despite having an intrinsic strength similar to that of electromagnetism,
weak interactions will appear very weak at low energies such that $q^2 \ll M_W^2$.
At energies approaching $M_W$, however, weak interactions will grow in importance
relative to electromagnetic ones and, when $q^2 \gg M_W^2$, weak and
electromagnetic interactions will contribute roughly equally.
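
The energy dependence of this comparison can be made concrete by forming the ratio of the simplified amplitudes (1.30) and (1.29) for a spacelike momentum transfer $q^2 = -Q^2 < 0$. The sketch assumes equal couplings (g = e), $M_W = 80$ GeV, and the spinless forms of the amplitudes used in the text; the function name and the sample values of Q² are illustrative.

```python
# Sketch: ratio of the simplified weak amplitude g^2 / (q^2 - M_W^2) to the
# electromagnetic amplitude e^2 / q^2, for spacelike q^2 = -Q2 (Q2 > 0).
# Assumptions: g = e by default, M_W = 80 GeV, spin effects ignored.

def weak_to_em_ratio(Q2, M_W=80.0, g2_over_e2=1.0):
    """Return [g^2 / (q^2 - M_W^2)] / [e^2 / q^2] with q^2 = -Q2 in GeV^2."""
    q2 = -Q2
    return g2_over_e2 * q2 / (q2 - M_W ** 2)

low_energy = weak_to_em_ratio(1.0)      # Q^2 = 1 GeV^2, far below M_W^2
high_energy = weak_to_em_ratio(1.0e6)   # Q^2 = 10^6 GeV^2, far above M_W^2
```

At Q² = 1 GeV² the ratio is about 1/6400, of order $Q^2/M_W^2$, while at Q² well above $M_W^2$ it approaches unity.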

‘Similar’ coupling strengths are still not ‘unified’, however. True unifi- 
cation only occurs after a more subtle effect has been included, which goes 
beyond the one-quantum exchange mechanism. This is the variation or ‘run- 
ning’ of the coupling strengths as a function of energy (or distance), caused 
by higher-order processes in perturbation theory. This will be discussed more 
fully in chapter 11 for QED, and in volume 2 for the other gauge couplings. 
It turns out that the possibility of unification depends crucially on an important
difference between the weak interaction quanta $W^\pm$ (to take the present
example) and the photons of QED, which has not been apparent in the simple
β-decay processes considered so far. The W’s are themselves ‘weakly charged’,
acting as both carriers and sources of the weak force field, and they therefore 
interact directly amongst themselves even in the absence of other matter. 
By contrast, photons are electromagnetically neutral and have no direct self- 
interactions. In theories where the gauge quanta self-interact, the coupling 
strength decreases as the energy increases, while for QED it increases. It is 
this differing ‘evolution’ that tends to bring the strengths together, ultimately. 


Even granted similar coupling strengths and the fact that both are 4-vector 
fields, the idea of any electroweak unification appears to founder immediately 
on the markedly different ranges of the two forces or, equivalently, of the 
masses of the mediating quanta ($m_\gamma = 0$, $M_W \approx 80$ GeV!). This difficulty
becomes even more pointed when we recall that, as previously mentioned, 
the masslessness of the photon is related to gauge invariance in electrody- 
namics: how then can there be any similar kind of gauge symmetry for weak 
interactions, given the distinctly non-zero masses of the mediating quanta? 
Nevertheless, in one of the great triumphs of 20th century theoretical physics, 
it is possible to see the two theories as essentially similar gauge theories, the 
gauge symmetry being ‘spontaneously broken’ in the case of weak interactions.
This is a central feature of the GSW electroweak theory. An indication
of how gauge quanta might acquire mass will be given in section 11.4 but a 
fuller explanation, with application to the electroweak theory, is reserved for 
volume 2. We will have a few more words to say about it in section 1.4.1. 


1.3.6 Strong interactions 


We turn to the contemporary version of Yukawa’s theory of strong interac- 
tions, now viewed as occurring between quarks rather than nucleons. Evidence 
that the strong interquark force is in some way similar to QED comes from 
nucleon-nucleon (or nucleon-antinucleon) collisions. Regarding the nucleons 
as composites of point-like quarks, we would expect to see prominent events at 
large scattering angles corresponding to ‘hard’ q-q collisions (recall Ruther- 
ford’s discovery of the nucleus). Now the result of such a hard collision would 
normally be to scatter the quarks to wide angles, ‘breaking up’ the nucleons 
in the process. However, quarks (except for the t quark) are not observed 
as free particles. Instead, what appears to happen is that, as the two quarks 
separate from each other, their mutual potential energy increases — so much so 
that, at a certain stage in the evolution of the scattering process, the energy 
stored in the potential converts into a new $q\bar{q}$ pair. This process continues,
with in general many pairs being produced as the original and subsequent 
pairs pull apart. By a mechanism which is still not quantitatively understood 
in detail, the produced quarks and antiquarks (and the original quarks in the 
nucleons) bind themselves into hadrons within an interaction volume of order 
1 fm, so that no free quarks are finally observed, consistent with ‘confine- 
ment’. Very strikingly, these hadrons emerge in quite well-collimated ‘jets’, 
suggesting rather vividly their ancestry in the original separating qq pair. 
Suppose, then, that we plot the angular distribution of such ‘two jet events’: 
it should tell us about the dynamics of the original interaction at the quark 
level. 


Figure 1.8 shows such an angular distribution from proton-antiproton scattering, 
so that the fundamental interaction in this case is the elastic scattering 
process $q\bar{q} \to q\bar{q}$. Here $\theta$ is the scattering angle in the $q\bar{q}$ centre of mass system 
(CMS). Amazingly, the $\theta$-distribution follows almost exactly the ‘Rutherford’ 
form $\sin^{-4}\theta/2$. 

FIGURE 1.8 

Angular distribution of two-jet events in $p\bar{p}$ collisions (Arnison et al. 1985) 
as a function of $\cos\theta$, where $\theta$ is the CMS scattering angle. The broken curve 
is the prediction of QCD, obtained in the lowest order of perturbation theory 
(one-gluon exchange); it is virtually indistinguishable from the Rutherford 
(one-photon exchange) shape $\sin^{-4}\theta/2$. The full curve includes higher order 
QCD corrections. 

We saw how, in the Coulomb case, this distribution could be understood 
as arising from the propagator factor $1/q^2$, which itself comes from the $1/r$ 
potential associated with the massless quantum involved, namely the photon. 
In the present case, the same $1/q^2$ factor is responsible: here, in the $q\bar{q}$ centre 
of mass system, $\boldsymbol{k}$ and $-\boldsymbol{k}$ are the momenta of the initial $q$ and $\bar{q}$, while $\boldsymbol{k}'$ and 
$-\boldsymbol{k}'$ are the corresponding final momenta. Once again, for elastic scattering 
there is no energy transfer, and $q^2 = -\boldsymbol{q}^2 = -(\boldsymbol{k} - \boldsymbol{k}')^2 = -4\boldsymbol{k}^2\sin^2\theta/2$ as 
before, leading to the $\sin^{-4}\theta/2$ form on squaring $1/q^2$. Once again, such a 
distribution is a clear signal that a massless quantum is being exchanged: in 
this case, the gluon. 
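
The kinematic step used here can be checked numerically. The following Python sketch is an illustration only (it is not part of the original text): it builds $\boldsymbol{k}$ and $\boldsymbol{k}'$ explicitly for elastic scattering in the centre of mass system, compares $(\boldsymbol{k} - \boldsymbol{k}')^2$ with $4\boldsymbol{k}^2\sin^2\theta/2$, and confirms that the square of $1/q^2$ has the $\sin^{-4}\theta/2$ shape up to a constant. The function names and the sample value of $|\boldsymbol{k}|$ are assumptions chosen for the example.

```python
import math


def momentum_transfer_sq(k_mag: float, theta: float) -> float:
    """Return (k - k')^2 for elastic scattering through angle theta in the CMS.

    k points along z; k' has the same magnitude, rotated by theta in the x-z plane.
    """
    k = (0.0, 0.0, k_mag)
    k_prime = (k_mag * math.sin(theta), 0.0, k_mag * math.cos(theta))
    return sum((a - b) ** 2 for a, b in zip(k, k_prime))


def rutherford_shape(theta: float) -> float:
    """The 'Rutherford' angular form sin^-4(theta/2), without normalization."""
    return math.sin(theta / 2.0) ** -4


K_MAG = 10.0  # illustrative |k| in arbitrary units (an assumption, not from the text)

for degrees in (30, 60, 90, 120, 150):
    theta = math.radians(degrees)
    q_sq = momentum_transfer_sq(K_MAG, theta)
    closed_form = 4.0 * K_MAG**2 * math.sin(theta / 2.0) ** 2
    # (1/q^2)^2 divided by sin^-4(theta/2) should be the same constant at every angle.
    ratio = (1.0 / q_sq) ** 2 / rutherford_shape(theta)
    print(f"theta={degrees:3d} deg  (k-k')^2={q_sq:9.4f}  "
          f"4k^2 sin^2(theta/2)={closed_form:9.4f}  ratio={ratio:.4e}")
```

The ratio printed in the last column is independent of angle, which is the statement that the squared propagator factor alone reproduces the Rutherford shape.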

It might then seem to follow that, as in the case of QED, the QCD inter- 
action has infinite range. But this cannot be right: the strong forces do not 
extend beyond the size of a typical hadron, which is roughly 1 fm. Indeed, the 
QCD force is mediated by the massless spin-1 gluon, and QCD is also a gauge 
theory; but the form of the QCD interaction, though somewhat analogous to 
QED, is more complicated, and the long range behaviour of the force is very 
different. 

As we have seen, each quark comes in three colours, and the QCD force 
is sensitive to this colour label: the gluons effectively ‘carry colour’ back and 
forth between the quarks, as shown in the one-gluon exchange process of fig- 
ure 1.9. Because the gluons carry colour, they can interact with themselves, 
like the W’s and Z’s of the GSW theory. As in that case, these gluonic 
self-interactions cause the QCD interaction strength to decrease at short 
distances (or high energies), ultimately tending to zero, the property known as 
asymptotic freedom. So in ‘hard’ collisions occurring at short inter-particle 
distances, the one-gluon exchange mechanism gives a good first approximation 
to the data. But the force grows much stronger as the quarks separate 
from each other, and perturbation theory is no longer a reliable guide. In 
fact, it seems that a new, non-perturbative, effect occurs, namely confinement. 
Once again, a gauge theory, with formal similarity to QED, has very 
different physical consequences. 

FIGURE 1.9 

Strong scattering via gluon exchange. At the top vertex, the ‘flow’ of colour is 
b (quark) $\to$ r (quark) + $\bar{r}$b (gluon); at the lower vertex the flow is $\bar{r}$b (gluon) 
+ r (quark) $\to$ b (quark). 

A phenomenological $q\bar{q}$ (or qq) potential which is often used in quark 
models has the form 

$$V = -\frac{a}{r} + br \qquad (1.33)$$


where the first term, which dominates at small r, arises from a single-gluon 
exchange so that $a \sim g_s^2$, where the strong (QCD) charge is $g_s$. The second 
term models confinement at larger values of r. Such a potential provides 
quite a good understanding of the gross structure of the $c\bar{c}$ and $b\bar{b}$ systems 
(see problem 1.5). A typical value for b is 0.85 GeV fm$^{-1}$ (which corresponds 
to a constant force of about 14 tonnes!). Thus at r ~ 2 fm, there is enough 
energy stored to produce a pair of the lighter quarks. This ‘linear’ part of 
the potential cannot be obtained by considering the exchange of one, or even 
a finite number of, gluons: in other words, not within an approach based on 
perturbation theory. 
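
The numbers quoted for the linear term are easy to reproduce. The sketch below is an illustration, not part of the original text: it converts b = 0.85 GeV fm$^{-1}$ into newtons and into the weight of a mass in tonnes, and evaluates the energy br stored at r = 2 fm. The conversion constants (the joule equivalent of 1 GeV, standard gravity, and $\hbar c$) are standard reference values that do not appear in the text, and the function names are hypothetical.

```python
GEV_IN_JOULES = 1.602176634e-10  # 1 GeV expressed in joules (reference value)
FM_IN_METRES = 1.0e-15           # 1 fm expressed in metres
STANDARD_GRAVITY = 9.80665       # m s^-2, used to express a force as a weight
HBAR_C_GEV_FM = 0.1973269804     # hbar*c in GeV fm (reference value)


def force_in_newtons(b_gev_per_fm: float) -> float:
    """Constant force corresponding to a linear potential V = b r."""
    return b_gev_per_fm * GEV_IN_JOULES / FM_IN_METRES


def weight_in_tonnes(force_newtons: float) -> float:
    """Mass (in tonnes) whose weight at the Earth's surface equals the force."""
    return force_newtons / STANDARD_GRAVITY / 1000.0


def stored_energy_gev(b_gev_per_fm: float, r_fm: float) -> float:
    """Energy b*r stored in the linear part of the potential at separation r."""
    return b_gev_per_fm * r_fm


B_TEXT = 0.85  # GeV/fm, the typical value quoted in the text

force = force_in_newtons(B_TEXT)
print(f"force  = {force:.3e} N  (~{weight_in_tonnes(force):.1f} tonnes weight)")
print(f"energy stored at r = 2 fm: {stored_energy_gev(B_TEXT, 2.0):.2f} GeV")

# Problem 1.5 quotes b in natural units (GeV^2); dividing by hbar*c gives GeV/fm.
print(f"0.18 GeV^2 = {0.18 / HBAR_C_GEV_FM:.3f} GeV/fm")
```

The last line shows that the value b = 0.18 GeV$^2$ used in problem 1.5 is of the same size as the 0.85 GeV fm$^{-1}$ quoted here.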

It is interesting to note that the linear part of the potential may be 
regarded as the solution of the one-dimensional form of $\nabla^2 V = 0$, namely 
$d^2V/dr^2 = 0$; this is in contrast to the Coulombic $1/r$ part, which is a 
solution (except at r = 0) to the full three-dimensional Laplace equation. This 
suggests that the colour field lines connecting two colour charges spread out 
into all of space when the charges are close to each other, but are somehow 
‘squeezed’ into an elongated one-dimensional ‘string’ as the distance between 
the charges becomes greater than about 1 fm. In the second volume, we shall 
see that numerical simulations of QCD, in which the space-time continuum is 
represented as a discrete lattice of points, indicate that such a linear potential 
does arise when QCD is treated non-perturbatively. It remains a challenge 
for theory to demonstrate that confinement follows from QCD. 

It is believed that gluons too are confined by QCD, so that — like quarks 
— they are not seen as isolated free particles. But they too ‘hadronize’ after 
being produced in a primitive short-distance collision process, as happens in 
the case of q’s and $\bar{q}$’s. Such ‘gluon jets’ provide indirect evidence for the 
existence and properties of gluons, as we shall see in volume 2. 

This is an appropriate moment at which to emphasize what appears to 
be a crucial distinction between the three ‘charges’ (electromagnetic, weak 
and strong) on the one hand, and the various flavour quantum numbers on 
the other. The former have a dynamical significance, whereas the latter do 
not. In the case of electric charge, for example, this means simply that a 
particle carrying this property responds in a definite way to the presence of 
an electromagnetic field and itself creates such a field. No such force fields are 
known for any of the flavour numbers, which are (at present) purely empirical 
classification devices, without dynamical significance. 


1.3.7 The gauge bosons of the Standard Model 


We can now gather together the mediators of the SM forces. They are all gauge 
bosons, meaning that they are the quanta of various 4-vector gauge fields. For 
example, the photon is the quantum of the electromagnetic (Maxwell) 4-vector 
potential $A^\mu(x)$ (see chapter 2 and section 6.3.1), which is the simplest gauge 
field. The gluon is the quantum of the QCD potential $A^\mu_a(x)$, where the colour 
index a runs from 1 to 8. The reason there are 8 of them may be guessed 
from figure 1.9: each gluon can be thought of as carrying one colour-anticolour 
combination, such as $\bar{r}b$, $\bar{b}g$, and so on; the symmetric combination $\bar{r}r + \bar{b}b + \bar{g}g$ 
is totally colourless and is discarded, leaving 3 × 3 − 1 = 8 (see section 12.2 in 
volume 2). In the GSW electroweak theory, there are four gauge fields, $W^\mu_i(x)$ where i runs 
from 1 to 3, and $B^\mu(x)$ which is analogous to $A^\mu(x)$. One linear combination 
of $W^\mu_3(x)$ and $B^\mu(x)$ is associated with the photon field $A^\mu(x)$; the orthogonal 
combination is associated with the $Z^\mu(x)$ field whose quantum is the $Z^0$. The 
charged carriers $W^\pm$ are associated with the $W^\mu_1(x)$ and $W^\mu_2(x)$ components 
of the $W^\mu_i(x)$ field. 

We shall assume that the mass of the photon and of the gluon is exactly 
zero. This can never be established experimentally, of course: the current 
experimental limit on the photon mass is that it is less than $1 \times 10^{-18}$ eV 
(Nakamura et al. 2010). All gauge fields have spin 1 (in units of $\hbar$). Ordinarily, a 
spin-1 particle would be expected to have three polarization states, according 
to quantum mechanics. However it is a general result that in the massless 
case the quanta have only two polarization states, both transverse to the 
direction of motion; the longitudinally polarized state is absent (this property, 
familiar for the corresponding classical fields which are purely transverse, will 
be discussed in section 7.3.1). By contrast, all three polarization states are 
present for the massive gauge bosons. 

TABLE 1.3 
Properties of SM gauge bosons. 

Particle             Polarization states   Mass                       Width/Lifetime 
$\gamma$ (photon)    2                     0 (theoretical)            stable 
g (gluon)            2                     0 (theoretical)            stable 
$W^\pm$              3                     80.399 $\pm$ 0.023 GeV     $\Gamma_W$ = 2.085 $\pm$ 0.042 GeV 
$Z^0$                3                     91.1876 $\pm$ 0.0021 GeV   $\Gamma_Z$ = 2.4952 $\pm$ 0.0023 GeV 

The photon and the gluon are stable particles. The $W^\pm$ and $Z^0$ particles 
decay with total widths of the order of 2 GeV (lifetimes $\sim 0.3 \times 10^{-24}$ s). 
Although this is significantly shorter than typical strong interaction decay 
lifetimes, these are of course weak decays, the rate being enhanced by the 
large energy release. 
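
The lifetimes follow from the widths through $\tau = \hbar/\Gamma$. As a hedged illustration (not part of the original text), the Python sketch below applies this relation to the central values of the W and Z widths listed in Table 1.3. The value of $\hbar$ in GeV s is a standard reference constant that is not quoted in the text, and the function name is hypothetical.

```python
HBAR_GEV_S = 6.582119569e-25  # hbar in GeV s (reference value, not from the text)


def lifetime_seconds(width_gev: float) -> float:
    """Mean lifetime tau = hbar / Gamma for a total decay width Gamma in GeV."""
    return HBAR_GEV_S / width_gev


# Central values of the total widths as listed in Table 1.3.
WIDTHS_GEV = {"W": 2.085, "Z": 2.4952}

for name, width in WIDTHS_GEV.items():
    tau = lifetime_seconds(width)
    print(f"{name}: Gamma = {width} GeV  ->  tau = {tau:.2e} s")
```

Both lifetimes come out at a few times $10^{-25}$ s, consistent with the estimate of about $0.3 \times 10^{-24}$ s for a width of order 2 GeV.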

Table 1.3 lists the properties of the SM gauge bosons; the masses and 
widths are taken from Nakamura et al. (2010). 
1.4 Renormalization and the Higgs sector of the 
Standard Model 


1.4.1 Renormalization 


So far we have been discussing processes in which only one particle is ex- 
changed. These will generally be the terms of lowest order in a perturbative 
expansion in powers of the coupling strength. But we must clearly go beyond 
lowest order, and include the effects of multi-particle exchanges. We shall 
explain how to do this in chapter 10, for a simple scalar field theory. Such 
multi-particle exchange amplitudes are given by integrals over the momenta 
of the exchanged particles, constrained only by four-momentum conservation 
(no integral arises in the case of the exchange of a single particle, because its 
four-momentum is fixed in terms of the momenta of the scattering particles, 
as in section 1.2.3). It turns out that the integrals nearly always diverge as the 
momenta of the exchanged particles tend to infinity. Nevertheless, as we shall 
explain in chapter 10, this theory can be reformulated, by a process called 
renormalization, in such a way that all multi-particle (higher-order) processes 
become finite and calculable — a quite remarkable fact, and one that is of 
course an absolutely crucial requirement in the case of the Standard Model 
interactions, where the relevant data are precise enough to test the accuracy 
of the theory well beyond lowest order, particularly in the case of QED (see 
chapter 11). The price to be paid for this taming of the divergences is just 
that the basic parameters of the theory, such as masses and coupling con- 
stants, have to be treated as parameters to be determined by comparison to 
the data, and cannot themselves be calculated. 

But some theories cannot be reformulated in this way: they are 
non-renormalizable. A simple test for whether a theory is renormalizable or not 
will be discussed in section 11.8: if the coupling constant has dimensions of 
a mass to an inverse power, the theory is non-renormalizable. An example of 
such a theory is the original four-Fermi theory of weak interactions, where the 
coupling constant $G_F$ has the dimensions of an inverse square mass (or energy) 
as we saw in (1.31). We will look at this theory again in section 11.8, but the 
essential point for our purpose now is that the dimensionful coupling constant 
introduces an energy scale into the problem, namely $G_F^{-1/2} \sim 300$ GeV. 
It seems reasonable to infer that a more relevant measure of the interaction 
strength will be given by the dimensionless number $E G_F^{1/2}$, where E is a 
characteristic physical energy scale of any weak process under consideration 
(for example, the energy in the centre of momentum frame in a two-particle 
scattering process, at least at energies much greater than the particle masses). 
Then, for energies very much less than $G_F^{-1/2}$ the effective strength will be 
very weak, and the lowest order term in perturbation theory will work fine; 
this is how the Fermi theory was used, for many years. But as the energy 
increases, what happens is that more and more parameters have to be taken 
from experiment, in order to control the divergences; as the energy approaches 
$G_F^{-1/2}$ the theory becomes totally non-predictive and breaks down. Thus 
renormalizability is regarded as highly desirable in a theory. 
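
To make the energy scale concrete, the following Python sketch (an illustration, not part of the original text) evaluates $G_F^{-1/2}$ and the dimensionless strength $E G_F^{1/2}$ at a few energies. The numerical value of $G_F$ is an assumed standard reference value that is not quoted in this section, and the sample energies and function names are chosen only for the example.

```python
# Fermi constant in GeV^-2: an assumed reference value, not quoted in this section.
FERMI_CONSTANT = 1.1663787e-5


def fermi_scale_gev(g_f: float = FERMI_CONSTANT) -> float:
    """Energy scale G_F^(-1/2) introduced by the dimensionful coupling."""
    return g_f ** -0.5


def effective_strength(energy_gev: float, g_f: float = FERMI_CONSTANT) -> float:
    """Dimensionless measure E * G_F^(1/2) of the weak interaction strength."""
    return energy_gev * g_f ** 0.5


print(f"G_F^(-1/2) = {fermi_scale_gev():.1f} GeV")

# Illustrative energies: a typical beta-decay scale, 1 GeV, and 100 GeV.
for energy in (1.0e-3, 1.0, 100.0):
    print(f"E = {energy:8.3f} GeV  ->  E * G_F^(1/2) = {effective_strength(energy):.2e}")
```

At MeV energies the effective strength is tiny, which is why lowest-order Fermi theory worked so well for nuclear beta decay; it becomes of order one only as E approaches a few hundred GeV.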

One might hope to come up with a renormalizable theory of weak interac- 
tions by replacing the four-fermion interaction by a Yukawa-like mechanism, 
with exchange of a quantum of mass M and dimensionless coupling y, say. 
Then just as in (1.32) we would identify $G_F \sim y^2/M^2$ at low energies. 
However, as we have seen, phenomenology implies that the massive exchanged 
quantum must have spin 1. Unfortunately, this type of straightforward 
massive spin-1 theory is not renormalizable either, as we shall discuss in chapter 
22 (in volume 2). The trouble can be traced directly to the existence of the 
longitudinal polarization state which, as noted previously, is present for a 
massive spin-1 particle. If the exchanged spin-1 quantum were massless, as 
in QED, it would lack that third polarization state, and the theory would be 
renormalizable. But weak interaction facts dictate both non-zero mass and 
spin-1. 

In the case of QED, there is a symmetry principle behind both the zero 
mass of the photon and the absence of the longitudinal polarization state: 
this symmetry is gauge invariance as we shall explain in section 7.3.1. It 
turns out that this symmetry is vital in rendering QED renormalizable. It is 
natural then to ask whether in the case of QED, a situation ever arises where 
the photon acquires mass, while retaining fully gauge-invariant interactions — 
and hence renormalizability (we would hope). If so, we would then have an 
analogue of what is needed for a renormalizable theory of weak interactions. 
The answer is that this can indeed happen, but it requires some extra dynamics 
to do it. Nature has actually provided us with a working model of what we 
want, in the phenomenon of superconductivity. There, the Meissner effect can 
be interpreted as implying that the photons propagating in a thin surface layer 
of the material have non-zero mass (see section 19.2). The dynamics behind 
this is subtle, and required many years of theoretical efforts before it was 
finally understood by Bardeen, Cooper and Schrieffer (1957). In simple terms, 
the mechanism is a two-step process. First, lattice interactions cause electrons 
to bind into pairs; then these pairs undergo Bose-Einstein condensation. This 
‘condensate’ is the BCS superconducting ground state. The essential point is 
that although the electromagnetic interactions are fully gauge invariant, the 
ground state is not. When a symmetry is broken by the ground state, it is 
said to be ‘spontaneously’ broken. We shall provide an introduction to the 
BCS ground state in chapter 17 of volume 2. 


The BCS theory is an example of spontaneous symmetry breaking oc- 
curring dynamically (through the particular lattice interactions). Many of 
the physically important phenomena can, however, be very satisfactorily de- 
scribed in terms of an effective theory, which treats only the electrodynamics 
of the condensate. Such a description was proposed by Ginzburg and Landau 
(1950), well before the BCS paper, in fact. 


How can this be applied in particle physics? Recall the idea, mentioned 
in section 1.3.1, that the analogue of the many-body ground state is the qft 
vacuum (Nambu 1961). In the Standard Model, the weak interactions are 
indeed described by a gauge-invariant theory, and the assumption is made 
that the vacuum breaks the gauge symmetry. The simplest way this idea 
can be implemented is along the lines of the Ginzburg-Landau theory, as 
suggested by Weinberg (1967) and by Salam (1968), and their proposal is em- 
bodied in the Glashow-Salam-Weinberg electroweak theory, which is part of 
the SM. It requires the introduction of four new spin-0 fields, which are called 
Higgs fields (Higgs 1964, Englert and Brout 1964, Guralnik et al. 1964), 
and which we may think of as playing the role of the BCS condensate (but 
not for electromagnetism, of course). The combined theory of quarks, lep- 
tons, electroweak gauge fields, and Higgs fields is gauge invariant, but one of 
the Higgs fields is supposed to have a non-zero average value in the physical 
vacuum, which breaks the gauge symmetry. The other three Higgs fields 
effectively become the longitudinal parts of the massive spin-1 $W^\pm$ and $Z^0$ fields, 
while the quantized excitations of the fourth Higgs field away from its vac- 
uum value appear physically as neutral spin-0 particles, called Higgs bosons 
(Higgs 1964). 
Apart from giving mass to the W and $Z^0$, the Higgs fields have more 
work to do. The electroweak gauge symmetry is exact only if all the fermion 
masses are zero; this is because it is a chiral symmetry (similar to, but not 
the same as, the chiral symmetry of QCD mentioned in section 1.2.2). Once 
again, this chiral gauge symmetry is essential to the renormalizability of the 
theory: if the fermion masses are incorporated in the usual way as parameters 
in the Lagrangian, the latter is no longer gauge invariant and the theory is 
non-renormalizable. In the SM, this problem is solved by having no fermion 
masses in the Lagrangian, and by postulating gauge-invariant Yukawa inter- 
actions between the fermions and the Higgs fields, which are arranged in such 
a way that, when the Higgs field gets a vacuum expectation value, the inter- 
action terms yield just the fermion masses. So again, the symmetry breaking 
is economically blamed on the same property of the vacuum. When the Higgs 
field oscillates away from its vacuum value, the result will be residual in- 
teractions between the fermions and the Higgs boson, which will have the 
defining characteristic that each fermion will interact with the Higgs boson 
with a strength proportional to its (i.e. the fermion’s) mass. This is clearly a 
testable prediction, once the Higgs boson is found. 

We have emphasized the role that the Higgs fields play in the renormaliz- 
ability of the GSW theory. The all-important proof of that renormalizability 
was given by ’t Hooft (1971b), and he also proved the renormalizability of 
QCD (1971a); see also ’t Hooft and Veltman (1972). 

The SM Higgs sector is the simplest one that will do the job; more compli- 
cated versions are possible. Perhaps the Higgs field is a composite formed in 
some new heavy fermion-antifermion dynamics, reminiscent of BCS pairing. 
In any case, the SM Higgs sector is there to be tested experimentally. In the 
following section we shall discuss briefly what is presently known about the 
SM Higgs boson, postponing a fuller discussion until we present the GSW 
theory in chapter 22 in volume 2. 

Before ending this section we must note that modern renormalization the- 
ory is concerned with more than perturbative calculability. The renormaliza- 
tion group and related ideas provide powerful tools for ‘improving’ perturba- 
tion theory, by systematically resumming terms which (in the particle physics 
case) dominate at short distances. Prominent among the results of this analy- 
sis (see chapters 15 and 16) are the concepts of energy-dependent (‘running’) 
masses and coupling strengths, and the calculation of QCD corrections to 
parton-model predictions. 


1.4.2 The Higgs boson of the Standard Model 


According to the SM, just one neutral spin-0 Higgs boson is expected; its 
mass $m_H$ is not predicted by the theory. The experimental discovery of the 
SM Higgs boson has been a major goal of several generations of accelerators: 
the LEP $e^+e^-$ collider at CERN, the Tevatron $p\bar{p}$ collider at Fermilab, and 
most recently the LHC pp collider at CERN. Experimentally, bounds on the 
Higgs mass can be obtained directly, through searching for its production and 
subsequent decay; non-observation will lead to a lower bound for $m_H$. There 
are also indirect constraints, coming from fits to precision measurements of 
electroweak observables. The latter are sensitive to higher order corrections 
which involve the Higgs boson as a virtual particle; these depend 
logarithmically on the unknown parameter $m_H$ and give upper bounds on $m_H$, assuming, 
of course, that the SM is correct. 
A lower bound 

$$m_H > 114.4\ \text{GeV} \quad (95\%\ \text{C.L.}) \qquad (1.34)$$

was set at LEP (LEP 2003) by combining data on direct searches. Combining 
this with a global fit to precision electroweak data, an upper bound 

$$m_H < 186\ \text{GeV} \quad (95\%\ \text{C.L.}) \qquad (1.35)$$

was obtained (Nakamura et al. 2010). 

By early 2012, the combined results of the CDF and D0 experiments at 
the Tevatron, and the ATLAS and CMS experiments at the LHC, excluded an 
$m_H$ value in the interval (approximately) 130 GeV to 600 GeV, at 95% C.L. 
Finally, in July 2012 the ATLAS (Aad et al. 2012) and CMS (Chatrchyan et 
al. 2012) collaborations announced the discovery, with a significance of $5\sigma$, 
of a neutral boson with a mass in the range 125-126 GeV, its production and 
decay rates being broadly compatible with the predictions for the SM Higgs 
boson. The existence of the measured decay to two photons implies that the 
particle is a boson with spin different from 1 (Landau 1948, Yang 1950), but 
spin-0 has not yet been confirmed. Nevertheless, it is probable that this is the 
(or perhaps a) Higgs boson. Its long-anticipated discovery opens a new era 
in particle physics: the experimental exploration of the symmetry-breaking 
sector of the SM. 




1.5 Summary 


The Standard Model provides a relatively simple picture of quarks and leptons 
and their non-gravitational interactions. The quark colour triplets are the 
basic source particles of the gluon fields in QCD, and they bind together to 
make hadrons. The weak interactions involve quark and lepton doublets, for 
instance the quark doublet (u, d) and the lepton doublet $(\nu_e, e^-)$ of the first 
generation. These are sources for the $W^\pm$ and $Z^0$ fields. Charged fermions 
(quarks and leptons) are sources for the photon field. All the mediating force 
quanta have spin-1. The weak and strong force fields are generalizations of 
electromagnetism; all three are examples of gauge theories, but realized in 
subtly different ways. 
In the following chapters our aim will be to lead the reader through the 
mathematical formalism involved in giving precise quantitative form to what 
we have so far described only qualitatively and to provide physical interpre- 
tation where appropriate. In the remainder of part I of the present volume, 
we first show how Schrödinger’s quantum mechanics and Maxwell’s 
electromagnetic theory may be combined as a gauge theory, in fact the simplest 
example of such a theory. We then introduce relativistic quantum mechanics 
for spin-0 and spin-$\frac{1}{2}$ particles, and include electromagnetism via the gauge 
principle. Lorentz transformations and discrete symmetries are also covered. 
In part II, we develop the formalism of quantum field theory, beginning with 
scalar fields and moving on to QED; this is then applied to many simple (‘tree 
level’) QED processes in part III. In the final part IV, we present an intro- 
duction to renormalization at the one-loop level, including renormalization 
of QED. The more complicated gauge theories of QCD and the electroweak 
theory are reserved for volume 2. 




Problems 


1.1 Evaluate the integral in (1.26) directly. [Hint: Use spherical polar 
coordinates with the polar axis along the direction of $\boldsymbol{q}$, so that $d^3r = r^2\,dr\,\sin\theta\,d\theta\,d\phi$, 
and $\exp(i\boldsymbol{q}\cdot\boldsymbol{r}) = \exp(i|\boldsymbol{q}|r\cos\theta)$. Make the change of variable $x = \cos\theta$, and 
do the $\phi$ integral (trivial) and the x integral. Finally do the r integral.] 


1.2 Using the concept of strangeness conservation in strong interactions, 
explain why the threshold energy (for $\pi^-$ incident on stationary protons) for 

$$\pi^- + p \to K^0 + \text{anything}$$

is less than for 

$$\pi^- + p \to \bar{K}^0 + \text{anything}$$

assuming both processes proceed through the strong interaction. 


1.3 Note: the invariant square $p^2$ of a 4-momentum $p = (E, \boldsymbol{p})$ is defined as 
$p^2 = E^2 - \boldsymbol{p}^2$. We remind the reader that $\hbar = c = 1$ (see Appendix B). 


(a) An electron of 4-momentum k scatters from a stationary proton 
of mass M via a one-photon exchange process, producing a final 
hadronic state of 4-momentum $p'$, the final electron 4-momentum 
being $k'$. Show that 

$$p'^2 = q^2 + 2M(E - E') + M^2$$

where $q^2 = (k - k')^2$, and $E$, $E'$ are the initial and final electron 
energies in this frame (i.e. the one in which the target proton is 
at rest). Show that if the electrons are highly relativistic then 
$q^2 = -4EE'\sin^2\theta/2$, where $\theta$ is the scattering angle in this frame. 
Deduce that for elastic scattering $E'$ and $\theta$ are related by 

$$E' = E \Big/ \left(1 + \frac{2E}{M}\sin^2\theta/2\right).$$


(b) Electrons of energy 4.879 GeV scatter elastically from protons, with 
$\theta = 10^\circ$. What is the observed value of $E'$? 


(c) In the scattering of these electrons, at 10°, it is found that there is 
a peak of events at E’ = 4.2 GeV; what is the invariant mass of the 
produced hadronic state (in MeV)? 


(d) Calculate the value of $E'$ at which the ‘quasi-elastic peak’ will be 
observed, when electrons of energy 400 MeV scatter at an angle 
$\theta = 45^\circ$ from a He nucleus, assuming that the struck nucleon is at 
rest inside the nucleus. Estimate the broadening of this final peak 
caused by the fact that the struck nucleon has, in fact, a momentum 
distribution by virtue of being localized within the nuclear size. 


1.4 


(a) In a simple non-relativistic model of a hydrogen-like atom, the 
energy levels are given by 

$$E_n = -\frac{\mu Z^2 \alpha^2}{2n^2}$$

where Z is the nuclear charge and $\mu$ is the reduced mass of the 
electron and nucleus. Calculate the splitting in eV between the 
n = 1 and n = 2 states in positronium, which is an $e^+e^-$ bound 
state, assuming this model holds. 


(b) In this model, the $e^+e^-$ potential is the simple Coulomb one 

$$-\frac{e^2}{4\pi\epsilon_0 r}.$$

Suppose that the potential between a heavy quark Q and an 
antiquark $\bar{Q}$ was 

$$-\frac{\alpha_s}{r}$$

where $\alpha_s$ is a ‘strong fine structure constant’. Calculate values of 
$\alpha_s$ (different in (i) and (ii)) corresponding to the information (the 
quark masses are phenomenological ‘quark model’ masses) 


(i) the splitting between the n = 2 and n = 1 states in charmonium 
($c\bar{c}$) is 588 MeV, and $m_c = 1870$ MeV; 
(ii) the splitting between the n = 2 and n = 1 states in the upsilon 
series ($b\bar{b}$) is 563 MeV, and $m_b = 5280$ MeV. 


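(c) In positronium, the n = 1 $^3S_1$ and n = 1 $^1S_0$ states are split by the 
hyperfine interaction, which has the form $\frac{7}{48}\alpha^4 m_e\,\boldsymbol{\sigma}_1\cdot\boldsymbol{\sigma}_2$ where $m_e$ 
is the electron mass and $\boldsymbol{\sigma}_1, \boldsymbol{\sigma}_2$ are the spin matrices for the $e^-$ 
and $e^+$ respectively. Calculate the expectation value of $\boldsymbol{\sigma}_1\cdot\boldsymbol{\sigma}_2$ in 
the $^3S_1$ and $^1S_0$ states, and hence evaluate the splitting between 
these levels (calculated in lowest order perturbation theory) in eV. 
[Hint: the total spin $\boldsymbol{S}$ is given by $\boldsymbol{S} = \frac{1}{2}(\boldsymbol{\sigma}_1 + \boldsymbol{\sigma}_2)$. So 
$\boldsymbol{S}^2 = \frac{1}{4}(\boldsymbol{\sigma}_1^2 + \boldsymbol{\sigma}_2^2 + 2\boldsymbol{\sigma}_1\cdot\boldsymbol{\sigma}_2)$. Hence the eigenvalues of $\boldsymbol{\sigma}_1\cdot\boldsymbol{\sigma}_2$ are directly 
related to those of $\boldsymbol{S}^2$.] 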

(d) Suppose an analogous ‘strong’ hyperfine interaction existed in the $c\bar{c}$ 
system, and was responsible for the splitting between the n = 1 $^3S_1$ 
and n = 1 $^1S_0$ states, which is 116 MeV experimentally (i.e. replace 
$\alpha$ by $\alpha_s$ and $m_e$ by $m_c = 1870$ MeV). Calculate the corresponding 
value of $\alpha_s$. 


1.5 The potential between a heavy quark Q and an antiquark $\bar{Q}$ is found 
empirically to be well represented by 

$$V(r) = -\frac{\alpha_s}{r} + br$$

where $\alpha_s = 0.5$ and $b = 0.18$ GeV$^2$. Indicate the origin of the first term in 
V(r), and the significance of the second. 

An estimate of the ground-state energy of the bound $Q\bar{Q}$ system may be 
made as follows. For a given r, the total energy is 

$$E(r) = 2m - \frac{\alpha_s}{r} + br + \frac{p^2}{m}$$

where m is the mass of the Q (or $\bar{Q}$) and p is its momentum (assumed 
non-relativistic). Explain why p may be roughly approximated by 1/r, and sketch 
the resulting E(r) as a function of r. Hence show that, in this approximation, 
the radius of the ground state, $r_0$, is given by the solution of 

$$\frac{2}{m r_0^3} = \frac{\alpha_s}{r_0^2} + b.$$

Taking m = 1.5 GeV as appropriate to the $c\bar{c}$ system, verify that for this 
system 

$$(1/r_0) \approx 0.67\ \text{GeV}$$

and calculate the energy of the $c\bar{c}$ ground state in GeV, according to this 
model. 

An excited $c\bar{c}$ state at 3.686 GeV has a total width of 278 keV, and one 
at 3.77 GeV has a total width of 24 MeV. Comment on the values of these 
widths. 
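
The stationarity condition in problem 1.5 has no simple closed-form solution, so a numerical check is natural. The Python sketch below is an illustration only, not a model solution from the text: writing $x = 1/r_0$, it solves $2x^3/m - \alpha_s x^2 - b = 0$ by bisection and then evaluates E(r) with p replaced by 1/r. The parameter values are those given in the problem; the function names and the bisection bracket are assumptions made for the example.

```python
def stationarity(x: float, m: float, alpha_s: float, b: float) -> float:
    """2 x^3 / m - alpha_s x^2 - b, with x = 1/r; it vanishes where dE/dr = 0."""
    return 2.0 * x**3 / m - alpha_s * x**2 - b


def solve_inverse_radius(m: float, alpha_s: float, b: float,
                         lo: float = 0.01, hi: float = 5.0,
                         tol: float = 1e-12) -> float:
    """Find x = 1/r0 (in GeV) by bisection on the bracket [lo, hi] (an assumed range)."""
    f_lo = stationarity(lo, m, alpha_s, b)
    if f_lo * stationarity(hi, m, alpha_s, b) > 0.0:
        raise ValueError("bracket does not contain a sign change")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = stationarity(mid, m, alpha_s, b)
        if f_lo * f_mid <= 0.0:
            hi = mid          # root lies in the lower half
        else:
            lo, f_lo = mid, f_mid  # root lies in the upper half
    return 0.5 * (lo + hi)


def total_energy(x: float, m: float, alpha_s: float, b: float) -> float:
    """E = 2m - alpha_s/r + b r + 1/(m r^2), written in terms of x = 1/r."""
    return 2.0 * m - alpha_s * x + b / x + x**2 / m


M_C, ALPHA_S, B = 1.5, 0.5, 0.18  # GeV, dimensionless, GeV^2 (values from the problem)

x0 = solve_inverse_radius(M_C, ALPHA_S, B)
print(f"1/r0 = {x0:.3f} GeV")
print(f"E(r0) = {total_energy(x0, M_C, ALPHA_S, B):.3f} GeV")
```

Bisection is used here simply because the cubic has a single positive root and the method needs nothing beyond a sign change; any standard root finder would serve equally well.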
1.6 The Hamiltonian for a two-state system using the normalized base states 
$|1\rangle, |2\rangle$ has the form 

$$\begin{pmatrix} \langle 1|H|1\rangle & \langle 1|H|2\rangle \\ \langle 2|H|1\rangle & \langle 2|H|2\rangle \end{pmatrix} = \begin{pmatrix} -a\cos 2\theta & a\sin 2\theta \\ a\sin 2\theta & a\cos 2\theta \end{pmatrix}$$

where a is real and positive. Find the energy eigenvalues $E_+$ and $E_-$, and 
express the corresponding normalized eigenstates $|+\rangle$ and $|-\rangle$ in terms of $|1\rangle$ 
and $|2\rangle$. 

At time t = 0 the system is in state $|1\rangle$. Show that the probability that it 
will be found to be in state $|2\rangle$ at a later time t is 

$$\sin^2 2\theta\,\sin^2(at).$$


Discuss how a formalism of this kind can be used in the context of neutrino 
oscillations. How might the existence of neutrino oscillations explain the solar 
neutrino problem? (This will be discussed in chapter 21 of volume 2.) 
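
The transition probability in problem 1.6 can be checked numerically without first finding the eigenstates. The Python sketch below is an illustration, not part of the original text: it builds the 2 × 2 Hamiltonian, forms the evolution operator $\exp(-iHt)$ from a truncated Taylor series (with $\hbar = 1$), and compares $|\langle 2|\exp(-iHt)|1\rangle|^2$ with $\sin^2 2\theta\,\sin^2(at)$. The helper names, the number of series terms, and the sample values of a, $\theta$ and t are assumptions chosen for the example.

```python
import math


def hamiltonian(a: float, theta: float):
    """2x2 Hamiltonian of problem 1.6 in the (|1>, |2>) basis."""
    c, s = math.cos(2.0 * theta), math.sin(2.0 * theta)
    return [[-a * c, a * s], [a * s, a * c]]


def matmul(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def evolution_operator(H, t: float, terms: int = 60):
    """exp(-i H t) from a truncated Taylor series; adequate when |H t| is of order a few."""
    U = [[1 + 0j, 0j], [0j, 1 + 0j]]
    term = [[1 + 0j, 0j], [0j, 1 + 0j]]
    for n in range(1, terms):
        # term_n = term_(n-1) * (-i H t) / n, so that term_n = (-i H t)^n / n!
        step = [[-1j * t * H[i][j] / n for j in range(2)] for i in range(2)]
        term = matmul(term, step)
        U = [[U[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return U


def prob_1_to_2(a: float, theta: float, t: float) -> float:
    """Probability of finding |2> at time t, starting from |1> at t = 0."""
    U = evolution_operator(hamiltonian(a, theta), t)
    return abs(U[1][0]) ** 2


A, THETA = 1.0, 0.3  # illustrative values (assumptions)
for t in (0.5, 1.0, 2.0):
    numeric = prob_1_to_2(A, THETA, t)
    formula = math.sin(2.0 * THETA) ** 2 * math.sin(A * t) ** 2
    print(f"t={t:3.1f}  numeric={numeric:.10f}  formula={formula:.10f}")
```

In the neutrino context, $\theta$ plays the role of a mixing angle and a is fixed by the energy difference between the two propagating states, so the same code illustrates how the oscillation amplitude depends on $\sin^2 2\theta$.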


1.7 In an interesting speculation, it has been suggested (Arkani-Hamed et al. 
1998, 1999, Antoniadis et al. 1998) that the weakness of gravity as observed in 
our (apparently) three-dimensional world could be due to the fact that gravity 
actually extends into additional ‘compactified’ dimensions (that is, dimensions 
which have the geometry of a circle, rather than of an infinite line). For the 
particles and forces of the Standard Model, however, such leakage into extra 
dimensions has to be confined to currently probed distances, which are of 
order $M_W^{-1}$. 


(a) Consider Newtonian gravity in (3 + d) spatial dimensions. Explain 
why you would expect that the gravitational potential will have the 
form 

$$V_{N,3+d}(r) = -\frac{m_1 m_2 G_{N,3+d}}{r^{d+1}}. \qquad (1.36)$$

[Think about how the ‘$1/r^2$’ fall-off of the force is related to the 
surface area of a sphere in the case d = 0. Note that the formula 
works for d = −2! What happens in the case d = −1?] 


(b) Show that $G_{N,3+d}$ has dimensions (mass)$^{-(2+d)}$. This allows us to 
introduce the ‘true’ Planck scale, i.e. the one for the underlying 
theory in 3 + d spatial dimensions, as $G_{N,3+d} = (M_{P,3+d})^{-(2+d)}$. 


(c) Now suppose that the form (1.36) only holds when the distance r 
between the masses is much smaller than R, the size of the compactified 
dimensions. If the masses are placed at distances $r \gg R$, their 
gravitational flux cannot continue to penetrate into the extra 
dimensions, and the potential (1.36) should reduce to the familiar 
three-dimensional one; so we must have 

$$V_{N,3+d}(r \gg R) = -\frac{m_1 m_2 G_{N,3+d}}{R^d}\,\frac{1}{r}. \qquad (1.37)$$

Show that this implies that 

$$M_P^2 = M_{P,3+d}^2\,(R M_{P,3+d})^d. \qquad (1.38)$$


(d) Suppose that d = 2 and R ~ 1 mm: what would $M_{P,3+d}$ be, in TeV? 
Suggest ways in which this theory might be tested experimentally. 
Taking $M_{P,3+d} \sim 1$ TeV, explore other possibilities for d and R. 


Electromagnetism as a Gauge Theory 


2.1 Introduction 


The previous chapter introduced the basic ideas of the Standard Model of 
particle physics, in which quarks and leptons interact via the exchange of 
gauge field quanta. We must now look more closely into what is the main 
concern of this book — namely, the particular nature of these ‘gauge 
theories’. 

One of the relevant forces — electromagnetism — has been well understood in 
its classical guise for many years. Over a century ago, Faraday, Maxwell and 
others developed the theory of electromagnetic interactions, culminating in 
Maxwell’s paper of 1864 (Maxwell 1864). Today Maxwell’s theory still stands 
— unlike Newton’s ‘classical mechanics’ which was shown by Einstein to require 
modifications at relativistic speeds, approaching the speed of light. Moreover, 
Maxwell’s electromagnetism, when suitably married with quantum mechanics, 
gives us ‘quantum electrodynamics’ or QED. We shall see in chapter 10 that 
this theory is in truly remarkable agreement with experiment. As we have 
already indicated, the theories of the weak and strong forces included in the 
Standard Model are generalizations of QED, and promise to be as successful 
as that theory. The simplest of the three, QED, is therefore our paradigmatic 
theory. 

From today’s perspective, the crucial thing about electromagnetism is that 
it is a theory in which the dynamics (i.e. the behaviour of the forces) is 
intimately related to a symmetry principle. In the everyday world, a symmetry 
operation is something that can be done to an object that leaves the object 
looking the same after the operation as before. By extension, we may consider 
mathematical operations — or ‘transformations’ — applied to the objects in our 
theory such that the physical laws look the same after the operations as they 
did before. Such transformations are usually called invariances of the laws. 
Familiar examples are, for instance, the translation and rotation invariance 
of all fundamental laws: Newton’s laws of motion remain valid whether or 
not we translate or rotate a system of interacting particles. But of course — 
precisely because they do apply to all laws, classical or quantum — these two 
invariances have no special connection with any particular force law. Instead, 
they constrain the form of the allowed laws to a considerable extent, but by 
no means uniquely determine them. Nevertheless, this line of argument leads 
one to speculate whether it might in fact be possible to impose further types 
of symmetry constraints so that the forms of the force laws are essentially 
determined. This would then be one possible answer to the question: why are 
the force laws the way they are? (Ultimately of course this only replaces one 
question by another!) 


In this chapter we shall discuss electromagnetism from this point of view. 
This is not the historical route to the theory, but it is the one which generalizes 
to the other two interactions. This is why we believe it important to present 
the central ideas of this approach in the familiar context of electromagnetism 
at this early stage. 

A distinction that is vital to the understanding of all these interactions 
is that between a global invariance and a local invariance. In a global in- 
variance the same transformation is carried out at all space-time points: it 
has an ‘everywhere simultaneously’ character. In a local invariance different 
transformations are carried out at different individual space-time points. In 
general, as we shall see, a theory that is globally invariant will not be invari- 
ant under locally varying transformations. However, by introducing new force 
fields that interact with the original particles in the theory in a specific way, 
and which also transform in a particular way under the local transformations, 
a sort of local invariance can be restored. We will see all these things more 
clearly when we go into more detail, but the important conceptual point to be 
grasped is this: one may view these special force fields and their interactions 
as existing in order to permit certain local invariances to be true. The par- 
ticular local invariance relevant to electromagnetism is the well-known gauge 
invariance of Maxwell’s equations: in the quantum form of the theory this 
property is directly related to an invariance under local phase transformations 
of the quantum fields. A generalized form of this phase invariance also under- 
lies the theories of the weak and strong interactions. For this reason they are 
all known as ‘gauge theories’. 

A full understanding of gauge invariance in electrodynamics can only be 
reached via the formalism of quantum field theory, which is not easy to mas- 
ter — and the theory of quantum gauge fields is particularly tricky, as we 
shall see in chapter 7. Nevertheless, many of the crucial ideas can be per- 
fectly adequately discussed within the more familiar framework of ordinary 
quantum mechanics, rather than quantum field theory, treating electromag- 
netism as a purely classical field. This is the programme followed in the rest 
of part I of this volume. In the present chapter we shall discuss these ideas in 
the context of non-relativistic quantum mechanics; in the following two chap- 
ters, we shall explore the generalization to relativistic quantum mechanics, 
for particles of spin-0 (via the Klein-Gordon equation) and spin-$\frac{1}{2}$ (via the 
Dirac equation). While containing substantial physics in their own right, these 
chapters constitute essential groundwork for the quantum field treatment in 
parts II-IV. 


2.2 The Maxwell equations: current conservation 


Question: Would you distinguish local conservation laws from global 
conservation laws? 

Feynman: If a cat were to disappear in Pasadena and at the same time 
appear in Erice, that would be an example of global conservation of cats. 
This is not the way cats are conserved. Cats or charge or baryons are 
conserved in a much more continuous way. If any of these quantities be- 
gin to disappear in a region, then they begin to appear in a neighbouring 
region. Consequently, we can identify the flow of charge out of a region 
with the disappearance of charge inside the region. This identification of 
the divergence of a flux with the time rate of change of a charge density is 
called a local conservation law. A local conservation law implies that the 
total charge is conserved globally, but the reverse does not hold. However, 
relativistically it is clear that non-local global conservation laws cannot 
exist, since to a moving observer the cat will appear in Erice before it 
disappears in Pasadena. 


—From the question-and-answer session following a lecture by R. P. Feyn- 
man at the 1964 International School of Physics ‘Ettore Majorana’ (Feyn- 
man 1965b). 


We begin by considering the basic laws of classical electromagnetism, the 
Maxwell equations. We use a system of units (Heaviside-Lorentz) which is 
convenient in particle physics (see appendix C). Before Maxwell's work these 
laws were 


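$$\nabla\cdot\boldsymbol{E} = \rho_{\rm em} \qquad \text{(Gauss’ law)} \tag{2.1}$$

$$\nabla\times\boldsymbol{E} = -\frac{\partial\boldsymbol{B}}{\partial t} \qquad \text{(Faraday-Lenz laws)} \tag{2.2}$$

$$\nabla\cdot\boldsymbol{B} = 0 \qquad \text{(no magnetic charges)} \tag{2.3}$$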


and, for steady currents, 
$$\nabla\times\boldsymbol{B} = \boldsymbol{j}_{\rm em} \qquad \text{(Ampère’s law)}. \tag{2.4}$$


Here $\rho_{\rm em}$ is the charge density and $\boldsymbol{j}_{\rm em}$ is the current density; these densities 
act as ‘sources’ for the E and B fields. Maxwell noticed that taking the 
divergence of this last equation leads to conflict with the continuity equation 
for electric charge 


$$\frac{\partial\rho_{\rm em}}{\partial t} + \nabla\cdot\boldsymbol{j}_{\rm em} = 0. \tag{2.5}$$

Since 
$$\nabla\cdot(\nabla\times\boldsymbol{B}) = 0 \tag{2.6}$$


from (2.4) there follows the result 


$$\nabla\cdot\boldsymbol{j}_{\rm em} = 0. \tag{2.7}$$
This can only be true in situations where the charge density is constant in 
time. For the general case, Maxwell modified Ampère’s law to read 


$$\nabla\times\boldsymbol{B} = \boldsymbol{j}_{\rm em} + \frac{\partial\boldsymbol{E}}{\partial t} \tag{2.8}$$
which is now consistent with (2.5). Equations (2.1)-(2.3), together with (2.8), 
constitute Maxwell's equations in free space (apart from the sources). 
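
The consistency of the modified law with the continuity equation can be checked in one line. The following is a short sketch, assuming only that space and time derivatives commute: take the divergence of Maxwell's modified form of Ampère's law and use Gauss' law to replace $\nabla\cdot\boldsymbol{E}$ by $\rho_{\rm em}$.

```latex
\begin{align*}
0 = \nabla\cdot(\nabla\times\boldsymbol{B})
  &= \nabla\cdot\boldsymbol{j}_{\rm em}
     + \frac{\partial}{\partial t}\,(\nabla\cdot\boldsymbol{E}) \\
  &= \nabla\cdot\boldsymbol{j}_{\rm em}
     + \frac{\partial\rho_{\rm em}}{\partial t}
\end{align*}
```

The last line is the continuity equation (2.5), now valid for time-dependent charge densities as well.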

It is worth spending a moment on the vitally important continuity equation 
(2.5) — note the Feynman quotation at the start of this section. Let us integrate 
this equation over any arbitrary volume $\Omega$, and write the result as 

$$\frac{\partial}{\partial t}\int_\Omega \rho_{\rm em}\,{\rm d}V = -\int_\Omega \nabla\cdot\boldsymbol{j}_{\rm em}\,{\rm d}V. \tag{2.9}$$


Equation (2.9) states that the rate of decrease of charge in any arbitrary 
volume $\Omega$ is due precisely and only to the flux of current out of its surface; 
that is, no net charge can be created or destroyed in $\Omega$. Since $\Omega$ can be 
made as small as we please, this means that electric charge must be locally 
conserved: a process in which charge is created at one point and destroyed at a 
distant one is not allowed, despite the fact that it conserves the charge overall 
or ‘globally’. The ultimate reason for this is that the global form of charge 
conservation would necessitate the instantaneous propagation of signals (such 
as ‘now, create a positron over there’), and this conflicts with special relativity 
—a theory which, historically, flowered from the soil of electrodynamics. The 
extra term introduced by Maxwell — the ‘electric displacement current’ — owes 
its place in the dynamical equations to a local conservation requirement. 
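
The content of a local conservation law can be made concrete with a small discretized sketch. This is a one-dimensional toy model, not anything taken from the text: the cell layout, the sample numbers and the function names `step` and `charge_in` are all illustrative assumptions. Charge changes in a cell only through currents across its two faces, so the change of charge in any sub-region equals the net current through that region's boundary, which is the discrete analogue of the integrated continuity equation.

```python
def step(rho, j, dt, dx):
    """Advance cell charge densities by one time step.

    rho[i] is the charge density in cell i; j[i] is the current through the
    left face of cell i, so j has len(rho) + 1 entries. This is a discrete
    form of d(rho)/dt + d(j)/dx = 0.
    """
    return [rho[i] - dt / dx * (j[i + 1] - j[i]) for i in range(len(rho))]


def charge_in(rho, dx, start, stop):
    """Total charge contained in cells start .. stop-1."""
    return sum(rho[start:stop]) * dx


# Illustrative data (assumed): six cells, no current through the outer walls.
rho = [0.0, 1.0, 3.0, 2.0, 0.5, 0.0]
j = [0.0, 0.2, -0.4, 0.7, 0.1, -0.3, 0.0]
dt, dx = 0.01, 0.1

new_rho = step(rho, j, dt, dx)

# Change of charge inside cells 2..3 versus net current out through faces 2 and 4.
region_change = charge_in(new_rho, dx, 2, 4) - charge_in(rho, dx, 2, 4)
net_outflow = dt * (j[4] - j[2])

if __name__ == "__main__":
    print("change in region:", region_change)
    print("minus net outflow:", -net_outflow)
    print("total before/after:", charge_in(rho, dx, 0, 6), charge_in(new_rho, dx, 0, 6))
```

A rule that removed charge from one cell and added it to a distant one would also keep the total fixed, but it could not be written in this face-current form; that is the sense in which local conservation is the stronger statement.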

We remark at this point that we have just introduced another local/global 
distinction, similar to that discussed earlier in connection with invariances. In 
this case the distinction applies to a conservation law, but since invariances 
are related to conservation laws in both classical and quantum mechanics, we 
should perhaps not be too surprised by this. However, as with invariances, 
conservation laws — such as charge conservation in electromagnetism — play a 
central role in gauge theories in that they are closely related to the dynamics. 
The point is simply illustrated by asking how we could measure the charge 
of a newly created subatomic particle X. There are two conceptually different 
ways: 


(i) We could arrange for X to be created in a reaction such as 
$$A + B \to C + D + X$$


where the charges of A, B, C and D are already known. In this case 
we can use charge conservation to determine the charge of X. 


(ii) We could see how particle X responded to known electromagnetic 
fields. This uses dynamics to determine the charge of X. 
Either way gives the same answer: it is the conserved charge which determines 
the particle’s response to the field. By contrast, there are several other 
conservation laws that seem to hold in particle physics, such as lepton number 
and baryon number, that apparently have no dynamical counterpart (cf the 
remarks at the end of section 1.3.6). To determine the baryon number of a 
newly produced particle, we have to use B conservation and tot up the total 
baryon number on either side of the reaction. As far as we know there is no 
baryonic force field. 

Thus gauge theories are characterized by a close interrelation between three 
conceptual elements: symmetries, conservation laws and dynamics. In fact, 
it is now widely believed that the only exact quantum number conservation 
laws are those which have an associated gauge theory force field — see com- 
ment (i) in section 2.6. Thus one might suspect that baryon number is not 
absolutely conserved — as is indeed the case in proposed unified gauge theo- 
ries of the strong, weak and electromagnetic interactions. In this discussion 
we have briefly touched on the connection between two pairs of these three 
elements: symmetries $\leftrightarrow$ dynamics; and conservation laws $\leftrightarrow$ dynamics. The 
precise way in which the remaining link is made — between the symmetry 
of electromagnetic gauge invariance and the conservation law of charge — is 
more technical. We will discuss this connection with the help of simple ideas 
from quantum field theory in chapter 7, section 7.4. For the present we con- 
tinue with our study of the Maxwell equations and, in particular, of the gauge 
invariance they exhibit. 
2.3 The Maxwell equations: Lorentz covariance and gauge invariance 

In classical electromagnetism, and especially in quantum mechanics, it is 
convenient to introduce the vector potential $A_\mu(x)$ in place of the fields $\boldsymbol{E}$ and 
$\boldsymbol{B}$. We write: 

$$\boldsymbol{B} = \nabla\times\boldsymbol{A} \tag{2.10}$$
$$\boldsymbol{E} = -\nabla V - \frac{\partial\boldsymbol{A}}{\partial t} \tag{2.11}$$


which defines the 3-vector potential A and the scalar potential V. With these 
definitions, equations (2.2) and (2.3) are then automatically satisfied. 

The origin of gauge invariance in classical electromagnetism lies in the 
fact that the potentials $\boldsymbol{A}$ and $V$ are not unique for given physical fields $\boldsymbol{E}$ 
and B. The transformations that A and V may undergo while preserving 
E and B (and hence the Maxwell equations) unchanged are called gauge 
transformations, and the associated invariance of the Maxwell equations is 
called gauge invariance. 
What are these transformations? Clearly $\boldsymbol{A}$ can be changed by 
$$\boldsymbol{A} \to \boldsymbol{A}' = \boldsymbol{A} + \nabla\chi \tag{2.12}$$

where $\chi$ is an arbitrary function, with no change in $\boldsymbol{B}$ since $\nabla\times\nabla f = 0$, for 
any scalar function $f$. To preserve $\boldsymbol{E}$, $V$ must then change simultaneously by 

$$V \to V' = V - \frac{\partial\chi}{\partial t}. \tag{2.13}$$
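
As a quick check, here is a sketch that assumes the standard relations $\boldsymbol{B} = \nabla\times\boldsymbol{A}$ and $\boldsymbol{E} = -\nabla V - \partial\boldsymbol{A}/\partial t$ for (2.10) and (2.11), and a gauge function $\chi$ smooth enough for its partial derivatives to commute. Substituting the transformed potentials gives:

```latex
\begin{align*}
\boldsymbol{B}' &= \nabla\times(\boldsymbol{A} + \nabla\chi)
                 = \nabla\times\boldsymbol{A} = \boldsymbol{B}, \\
\boldsymbol{E}' &= -\nabla\!\left(V - \frac{\partial\chi}{\partial t}\right)
                   - \frac{\partial}{\partial t}\,(\boldsymbol{A} + \nabla\chi)
                 = -\nabla V - \frac{\partial\boldsymbol{A}}{\partial t}
                 = \boldsymbol{E}.
\end{align*}
```

The two $\chi$-dependent terms in $\boldsymbol{E}'$ cancel only because the shift in $V$ is tied to the shift in $\boldsymbol{A}$ through the same function $\chi$.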


These transformations can be combined into a single compact equation by 
introducing the 4-vector potential¹: 

$$A^\mu = (V, \boldsymbol{A}) \tag{2.14}$$

and noting (from problem 2.1) that the differential operators $(\partial/\partial t, -\nabla)$ form 
the components of a 4-vector operator $\partial^\mu$. A gauge transformation is then 
specified by 
$$A^\mu \to A'^\mu = A^\mu - \partial^\mu\chi. \tag{2.15}$$


The Maxwell equations can also be written in a manifestly Lorentz covariant 
form (see appendix D) using the 4-current $j^\mu_{\rm em}$, given by 
$$j^\mu_{\rm em} = (\rho_{\rm em}, \boldsymbol{j}_{\rm em}) \tag{2.16}$$
in terms of which the continuity equation takes the form (problem 2.1): 
$$\partial_\mu j^\mu_{\rm em} = 0. \tag{2.17}$$
The Maxwell equations (2.1) and (2.8) then become (problem 2.2): 
$$\partial_\mu F^{\mu\nu} = j^\nu_{\rm em} \tag{2.18}$$
where we have defined the field strength tensor: 
$$F^{\mu\nu} \equiv \partial^\mu A^\nu - \partial^\nu A^\mu. \tag{2.19}$$
Under the gauge transformation 
$$A^\mu \to A'^\mu = A^\mu - \partial^\mu\chi \tag{2.20}$$
$F^{\mu\nu}$ remains unchanged: 
$$F^{\mu\nu} \to F'^{\mu\nu} = F^{\mu\nu} \tag{2.21}$$
so $F^{\mu\nu}$ is gauge invariant and so, therefore, are the Maxwell equations in 


¹ See appendix D for relativistic notation and for an explanation of the very important 
concept of covariance, which we are about to invoke in the context of Lorentz transforma- 
tions, and will use again in the next section in the context of gauge transformations; we 
shall also use it in other contexts in later chapters. 
the form (2.18). The ‘Lorentz-covariant and gauge-invariant field equations’ 
satisfied by $A^\mu$ then follow from equations (2.18) and (2.19): 

$$\Box A^\nu - \partial^\nu(\partial_\mu A^\mu) = j^\nu_{\rm em}. \tag{2.22}$$
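
As a brief check of (2.22), offered as a sketch rather than as part of the original derivation, one can substitute the definition of the field strength tensor into the field equation and then apply the gauge transformation $A^\mu \to A^\mu - \partial^\mu\chi$. Here $\Box \equiv \partial_\mu\partial^\mu$, and $\chi$ is assumed smooth enough for partial derivatives to commute.

```latex
\begin{align*}
\partial_\mu F^{\mu\nu}
  &= \partial_\mu(\partial^\mu A^\nu - \partial^\nu A^\mu)
   = \Box A^\nu - \partial^\nu(\partial_\mu A^\mu), \\
\Box A'^\nu - \partial^\nu(\partial_\mu A'^\mu)
  &= \Box A^\nu - \partial^\nu(\partial_\mu A^\mu)
     - \Box\,\partial^\nu\chi + \partial^\nu\,\Box\chi \\
  &= \Box A^\nu - \partial^\nu(\partial_\mu A^\mu).
\end{align*}
```

The left-hand side of the field equation for $A^\mu$ is therefore unchanged by the transformation, as it must be if it is built from $F^{\mu\nu}$ alone.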


Since gauge transformations turn out to be of central importance in the 
quantum theory of electromagnetism, it would be nice to have some insight 
into why Maxwell’s equations are gauge invariant. The all-important ‘fourth’ 
equation (2.8) was inferred by Maxwell from local charge conservation, as 
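expressed by the continuity equation 

$$\partial_\mu j^\mu_{\rm em} = 0. \tag{2.23}$$

The field equation 
$$\partial_\mu F^{\mu\nu} = j^\nu_{\rm em} \tag{2.24}$$
then of course automatically embodies (2.23). The mathematical reason it 
does so is that $F^{\mu\nu}$ is a four-dimensional kind of ‘curl’ 

$$F^{\mu\nu} \equiv \partial^\mu A^\nu - \partial^\nu A^\mu \tag{2.25}$$
which (as we have seen in (2.21)) is unchanged by a gauge transformation 
$$A^\mu \to A'^\mu = A^\mu - \partial^\mu\chi. \tag{2.26}$$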
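
The way the ‘curl’ structure enforces current conservation can be spelled out explicitly. The following sketch uses only the antisymmetry $F^{\nu\mu} = -F^{\mu\nu}$ and the symmetry of mixed partial derivatives: apply $\partial_\nu$ to the field equation and relabel the summed indices.

```latex
\begin{align*}
\partial_\nu j^\nu_{\rm em}
  = \partial_\nu\partial_\mu F^{\mu\nu}
  = \tfrac{1}{2}\,\partial_\nu\partial_\mu\,(F^{\mu\nu} + F^{\nu\mu})
  = 0 .
\end{align*}
```

A source that did not satisfy the continuity equation could therefore never be consistently coupled to $F^{\mu\nu}$.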


Hence there is the suggestion that the gauge invariance is related in some way 
to charge conservation. However, the connection is not so simple. Wigner 
(1949) has given a simple argument to show that the principle that no phys- 
ical quantity can depend on the absolute value of the electrostatic poten- 
tial, when combined with energy conservation, implies the conservation of 
charge. Wigner’s argument relates charge (and energy) conservation to an 
invariance under transformation of the electrostatic potential by a constant: 
charge conservation alone does not seem to require the more general space- 
time-dependent transformation of gauge invariance. 

Changing the value of the electrostatic potential by a constant amount is 
an example of what we have called a global transformation (since the change 
in the potential is the same everywhere). Invariance under this global trans- 
formation is related to a conservation law: that of charge. But this global 
invariance is not sufficient to generate the full Maxwellian dynamics. How- 
ever, as remarked by ’t Hooft (1980), one can regard equations (2.12) and 
(2.13) as expressing the fact that the local change in the electrostatic 
potential $V$ (the $\partial\chi/\partial t$ term in (2.13)) can be compensated, in the sense of leaving 
the Maxwell equations unchanged, by a corresponding local change in the 
magnetic vector potential $\boldsymbol{A}$. Thus by including magnetic effects, the global 
invariance under a change of $V$ by a constant can be extended to a local 
invariance (which is a much more restrictive condition to satisfy). Hence there 
is a beginning of a suggestion that one might almost ‘derive’ the complete 
Maxwell equations, which unify electricity and magnetism, from the require- 
ment that the theory be expressed in terms of potentials in such a way as 
to be invariant under local (gauge) transformations on those potentials. Cer- 
tainly special relativity must play a role too: this also links electricity and 
magnetism, via the magnetic effects of charges as seen by an observer moving 
relative to them. If a 4-vector potential $A^\mu$ is postulated, and it is then 
demanded that the theory involve it only in a way which is insensitive to local 
changes of the form (2.15), one is led naturally to the idea that the physical 
fields enter only via the quantity $F^{\mu\nu}$, which is invariant under (2.15). 
From this, one might conjecture the field equation on grounds of Lorentz 
covariance. 

It goes without saying that this is certainly not a ‘proof’ or ‘derivation’ of 
the Maxwell equations. Nevertheless, the idea that dynamics (in this case, the 
complete interconnection of electric and magnetic effects) may be intimately 
related to a local invariance requirement (in this case, electromagnetic gauge 
invariance) turns out to be a fruitful one. As indicated in section 2.1, it is 
generally the case that, when a certain global invariance is generalized to a 
local one, the existence of a new ‘compensating’ field is entailed, interacting in 
a specified way. The first example of a dynamical theory ‘derived’ from a local 
invariance requirement seems to be the theory of Yang and Mills (1954) (see 
also Shaw 1955). Their work was extended by Utiyama (1956), who developed 
a general formalism for such compensating fields. As we have said, these types 
of dynamical theories, based on local invariance principles, are called gauge 
theories. 

It is a remarkable fact that the interactions in the Standard Model of par- 
ticle physics are of precisely this type. We have briefly discussed the Maxwell 
equations in this light, and we will continue with (quantum) electrodynam- 
ics in the following two sections. The two other fundamental interactions 
— the strong interaction between quarks and the weak interaction between 
quarks and leptons — also seem to be described by gauge theories (of essen- 
tially the Yang-Mills type), as we shall see in detail in the second volume of 
this book. A fourth example, but one which we shall not pursue in this book, 
is that of general relativity (the theory of gravitational interactions). Utiyama 
(1956) showed that this theory could be arrived at by generalizing the global 
(space-time independent) coordinate transformations of special relativity to 
local ones; as with electromagnetism, the more restrictive local invariance 
requirements entailed the existence of a new field — the gravitational one — 
with an (almost) prescribed form of interaction. Unfortunately, despite this 
‘gauge’ property, no consistent quantum field theory of general relativity is 
known. 

In order to proceed further, we must now discuss how such (gauge) ideas 
are incorporated into quantum mechanics. 
2.4 Gauge invariance (and covariance) in quantum mechanics 


The Lorentz force law for a non-relativistic particle of charge q moving with 
velocity v under the influence of both electric and magnetic fields is 


$$\boldsymbol{F} = q\boldsymbol{E} + q\boldsymbol{v}\times\boldsymbol{B}. \tag{2.27}$$


It may be derived, via Hamilton’s equations, from the classical Hamiltonian² 
$$H = \frac{1}{2m}(\boldsymbol{p} - q\boldsymbol{A})^2 + qV. \tag{2.28}$$


The Schrödinger equation for such a particle in an electromagnetic field is 


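$$\left(\frac{1}{2m}(-i\nabla - q\boldsymbol{A})^2 + qV\right)\psi(\boldsymbol{x},t) = i\frac{\partial\psi(\boldsymbol{x},t)}{\partial t} \tag{2.29}$$

which is obtained from the classical Hamiltonian by the usual prescription, 
$\boldsymbol{p} \to -i\nabla$, for Schrödinger’s wave mechanics ($\hbar = 1$). Note the appearance of 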
the operator combinations 


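$$\boldsymbol{D} \equiv \nabla - iq\boldsymbol{A}, \qquad D^0 \equiv \partial/\partial t + iqV \tag{2.30}$$

in place of $\nabla$ and $\partial/\partial t$, in going from the free-particle Schrödinger equation 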
to the electromagnetic field case. 

The solution $\psi(\boldsymbol{x},t)$ of the Schrödinger equation (2.29) describes completely 
the state of the particle moving under the influence of the potentials 
V, A. However, these potentials are not unique, as we have already seen: 
they can be changed by a gauge transformation 


$$\boldsymbol{A} \to \boldsymbol{A}' = \boldsymbol{A} + \nabla\chi \tag{2.31}$$
$$V \to V' = V - \partial\chi/\partial t \tag{2.32}$$


and the Maxwell equations for the fields E and B will remain the same. 
This immediately raises a serious question: if we carry out such a change 
of potentials in equation (2.29), will the solution of the resulting equation 
describe the same physics as the solution of equation (2.29)? If it does, 
we shall be able to assume the validity of Maxwell's theory for the quan- 
tum world; if not, some modification will be necessary, since the gauge sym- 
metry possessed by the Maxwell equations will be violated in the quantum 
theory. 


² We set $\hbar = c = 1$ throughout (see appendix B). 
The answer to the question just posed is evidently negative, since it is 
clear that the same ‘$\psi$’ cannot possibly satisfy both (2.29) and the analogous 
equation with $(V, \boldsymbol{A})$ replaced by $(V', \boldsymbol{A}')$. Unlike Maxwell’s equations, the 
Schrödinger equation is not gauge invariant. But we must remember that the 
wavefunction $\psi$ is not a directly observable quantity, as the electromagnetic 
fields $\boldsymbol{E}$ and $\boldsymbol{B}$ are. Perhaps $\psi$ does not need to remain unchanged (invariant) 
when the potentials are changed by a gauge transformation. In fact, 
in order to have any chance of ‘describing the same physics’ in terms of the 
gauge-transformed potentials, we will have to allow $\psi$ to change as well. This 
is a crucial point: for quantum mechanics to be consistent with Maxwell’s 
equations it is necessary for the gauge transformations (2.31) and (2.32) of 
the Maxwell potentials to be accompanied also by a transformation of the 
quantum-mechanical wavefunction, $\psi \to \psi'$, where $\psi'$ satisfies the equation 

$$\left(\frac{1}{2m}(-i\nabla - q\boldsymbol{A}')^2 + qV'\right)\psi'(\boldsymbol{x},t) = i\frac{\partial\psi'(\boldsymbol{x},t)}{\partial t}. \tag{2.33}$$

Note that the form of (2.33) is exactly the same as the form of (2.29) — it is 
this that will effectively ensure that both ‘describe the same physics’. Readers 
of appendix D will expect to be told that, if we can find such a $\psi'$, we may 
then assert that (2.29) is gauge covariant, meaning that it maintains the same 
form under a gauge transformation. (The transformations relevant to this use 
of ‘covariance’ are gauge transformations.) 

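Since we know the relations (2.31) and (2.32) between $\boldsymbol{A}$, $V$ and $\boldsymbol{A}'$, $V'$, 
we can actually find what $\psi'(\boldsymbol{x},t)$ must be in order that equation (2.33) be 
consistent with (2.29). We shall state the answer and then verify it; then we 
shall discuss the physical interpretation. The required $\psi'(\boldsymbol{x},t)$ is 

$$\psi'(\boldsymbol{x},t) = \exp[iq\chi(\boldsymbol{x},t)]\,\psi(\boldsymbol{x},t) \tag{2.34}$$

where $\chi$ is the same space-time-dependent function as appears in equations 
(2.31) and (2.32). To verify this we consider 

$$\begin{aligned}
(-i\nabla - q\boldsymbol{A}')\psi' &= [-i\nabla - q\boldsymbol{A} - q(\nabla\chi)][\exp(iq\chi)\psi] \\
&= q(\nabla\chi)\exp(iq\chi)\psi + \exp(iq\chi)\cdot(-i\nabla\psi) \\
&\quad + \exp(iq\chi)\cdot(-q\boldsymbol{A}\psi) - q(\nabla\chi)\exp(iq\chi)\psi.
\end{aligned} \tag{2.35}$$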


The first and the last terms cancel leaving the result: 

$$(-i\nabla - q\boldsymbol{A}')\psi' = \exp(iq\chi)\cdot(-i\nabla - q\boldsymbol{A})\psi \tag{2.36}$$
which may be written using equation (2.30) as: 
$$(-i\boldsymbol{D}'\psi') = \exp(iq\chi)\cdot(-i\boldsymbol{D}\psi). \tag{2.37}$$


Thus, although the space-time-dependent phase factor feels the action of the 
gradient operator V, it ‘passes through’ the combined operator D’ and con- 
verts it into D: in fact comparing the equations (2.34) and (2.37), we see that 
$\boldsymbol{D}'\psi'$ bears to $\boldsymbol{D}\psi$ exactly the same relation as $\psi'$ bears to $\psi$. In just the 
same way we find (cf equation (2.30)) 

$$(iD^{0\prime}\psi') = \exp(iq\chi)\cdot(iD^0\psi) \tag{2.38}$$

where we have used equation (2.32) for $V'$. Once again, $D^{0\prime}\psi'$ is simply related 
to $D^0\psi$. Repeating the operation which led to equation (2.37) we find 

$$\begin{aligned}
\frac{1}{2m}(-i\boldsymbol{D}')^2\psi' &= \exp(iq\chi)\cdot\frac{1}{2m}(-i\boldsymbol{D})^2\psi \\
&= \exp(iq\chi)\cdot iD^0\psi \quad \text{(using equation (2.29))} \\
&= iD^{0\prime}\psi' \quad \text{(using equation (2.38))}.
\end{aligned} \tag{2.39}$$


Equation (2.39) is just (2.33) written in the D notation of equation (2.30), 
so we have verified that (2.34) is the correct relationship between $\psi'$ and 
$\psi$ to ensure consistency between equations (2.29) and (2.33). Precisely this 
consistency is summarized by the statement that (2.29) is gauge covariant. 

Do $\psi$ and $\psi'$ describe the same physics, in fact? The answer is yes, but it 
is not quite trivial. It is certainly obvious that the probability densities $|\psi|^2$ 
and $|\psi'|^2$ are equal, since in fact $\psi$ and $\psi'$ in equation (2.34) are related by 
a phase transformation. However, we can be interested in other observables 
involving the derivative operators $\nabla$ or $\partial/\partial t$, for example the current, which 
is essentially $\psi^*(\nabla\psi) - (\nabla\psi)^*\psi$. It is easy to check that this current is 
not invariant under (2.34), because the phase $\chi(\boldsymbol{x},t)$ is $\boldsymbol{x}$-dependent. But 
equations (2.37) and (2.38) show us what we must do to construct gauge-invariant 
currents: namely, we must replace $\nabla$ by $\boldsymbol{D}$ (and in general also 
$\partial/\partial t$ by $D^0$) since then: 

$$\psi'^*(\boldsymbol{D}'\psi') = \psi^*\exp(-iq\chi)\cdot\exp(iq\chi)\cdot(\boldsymbol{D}\psi) = \psi^*\boldsymbol{D}\psi \tag{2.40}$$

for example. Thus the identity of the physics described by $\psi$ and $\psi'$ is indeed 
ensured. Note, incidentally, that the equality between the first and last terms 
in (2.40) is indeed a statement of (gauge) invariance. 
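
These statements are easy to test numerically. The sketch below works in one space dimension at a fixed time and checks, by central finite differences, that the gauge-covariant derivative of the transformed wavefunction is $\exp(iq\chi)$ times that of the original, that $\psi^* D\psi$ is gauge invariant, and that the same expression built with the ordinary derivative is not. The particular functions chosen for $\psi$, $A$ and $\chi$, the value of $q$, and all function names are arbitrary assumptions made for illustration only.

```python
import cmath
import math

q = 0.7  # illustrative charge (assumed value)


def psi(x):
    """Sample wavefunction (assumed): Gaussian envelope times a plane wave."""
    return cmath.exp(-x * x + 2j * x)


def A(x):
    """Sample vector potential component along x (assumed)."""
    return math.sin(x)


def chi(x):
    """Sample gauge function (assumed), deliberately x-dependent."""
    return 0.5 * x * x + math.cos(3 * x)


def ddx(f, x, h=1e-5):
    """Central finite-difference derivative of f at x."""
    return (f(x + h) - f(x - h)) / (2 * h)


def A_prime(x):
    """Gauge-transformed potential, A' = A + d(chi)/dx."""
    return A(x) + ddx(chi, x)


def psi_prime(x):
    """Gauge-transformed wavefunction, psi' = exp(i q chi) psi."""
    return cmath.exp(1j * q * chi(x)) * psi(x)


def D(f, a, x):
    """Covariant derivative (d/dx - i q a) acting on f, evaluated at x."""
    return ddx(f, x) - 1j * q * a(x) * f(x)


x0 = 0.3
phase = cmath.exp(1j * q * chi(x0))

lhs = D(psi_prime, A_prime, x0)          # D' psi'
rhs = phase * D(psi, A, x0)              # exp(i q chi) D psi

covariant_old = psi(x0).conjugate() * D(psi, A, x0)
covariant_new = psi_prime(x0).conjugate() * D(psi_prime, A_prime, x0)

naive_old = psi(x0).conjugate() * ddx(psi, x0)
naive_new = psi_prime(x0).conjugate() * ddx(psi_prime, x0)

if __name__ == "__main__":
    print("|D'psi' - exp(iq chi) D psi| =", abs(lhs - rhs))
    print("psi* D psi changes by        ", abs(covariant_new - covariant_old))
    print("psi* d/dx psi changes by     ", abs(naive_new - naive_old))
```

The first two differences vanish up to finite-difference error, while the third is of order $q\,|\partial\chi/\partial x|\,|\psi|^2$, which is not small for the functions chosen here.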

We summarize these important considerations by the statement that the 
gauge invariance of Maxwell equations re-emerges as a covariance in quantum 
mechanics provided we make the combined transformation 


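$$\begin{aligned}
\boldsymbol{A} &\to \boldsymbol{A}' = \boldsymbol{A} + \nabla\chi \\
V &\to V' = V - \partial\chi/\partial t \\
\psi &\to \psi' = \exp(iq\chi)\psi
\end{aligned} \tag{2.41}$$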


on the potential and on the wavefunction. 

The Schrödinger equation is non-relativistic, but the Maxwell equations are 
of course fully relativistic. One might therefore suspect that the prescriptions 
discovered here are actually true relativistically as well, and this is indeed 
the case. We shall introduce the spin-0 and spin-$\frac{1}{2}$ relativistic equations in 
chapter 3. For the present we note that (2.30) can be written in manifestly 
Lorentz covariant form as 
$$D^\mu \equiv \partial^\mu + iqA^\mu \tag{2.42}$$

in terms of which (2.37) and (2.38) become 
$$-iD'^\mu\psi' = \exp(iq\chi)\cdot(-iD^\mu\psi). \tag{2.43}$$


It follows that any equation involving the operator $\partial^\mu$ can be made gauge 
invariant under the combined transformation 

$$\begin{aligned}
A^\mu &\to A'^\mu = A^\mu - \partial^\mu\chi \\
\psi &\to \psi' = \exp(iq\chi)\psi
\end{aligned}$$

if $\partial^\mu$ is replaced by $D^\mu$. In fact, we seem to have a very simple prescription 
for obtaining the wave equation for a particle in the presence of an 
electromagnetic field from the corresponding free particle wave equation: make the 
replacement 
$$\partial^\mu \to D^\mu \equiv \partial^\mu + iqA^\mu. \tag{2.44}$$


In the following section this will be seen to be the basis of the so-called ‘gauge 
principle’ whereby, in accordance with the idea advanced in the previous sec- 
tions, the form of the interaction is determined by the insistence on (local) 
gauge invariance. 
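
As an illustration of this replacement, here is a sketch applying it to the free-particle Schrödinger equation. It uses the components $D^0 = \partial/\partial t + iqV$ and $\boldsymbol{D} = \nabla - iq\boldsymbol{A}$ of the covariant derivative; the intermediate rearrangement is added here for illustration.

```latex
\begin{gather*}
i\frac{\partial\psi}{\partial t} = \frac{1}{2m}(-i\nabla)^2\psi
\quad\longrightarrow\quad
iD^0\psi = \frac{1}{2m}(-i\boldsymbol{D})^2\psi, \\
iD^0\psi = i\frac{\partial\psi}{\partial t} - qV\psi, \qquad
-i\boldsymbol{D}\psi = (-i\nabla - q\boldsymbol{A})\psi, \\
\left(\frac{1}{2m}(-i\nabla - q\boldsymbol{A})^2 + qV\right)\psi
 = i\frac{\partial\psi}{\partial t}.
\end{gather*}
```

The last line is the Schrödinger equation for a charged particle in an electromagnetic field, recovered here from the free equation by the substitution alone.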

One final remark: this new kind of derivative 


$$D^\mu \equiv \partial^\mu + iqA^\mu \tag{2.45}$$


turns out to be of fundamental importance — it will be the operator which 
generalizes from the (Abelian) phase symmetry of QED (see comment (iii) 
of section 2.6) to the (non-Abelian) phase symmetry of our weak and strong 
interaction theories. It is called the ‘gauge covariant derivative’, the term 
being usually shortened to ‘covariant derivative’ in the present context. The 
geometrical significance of this term will be explained in volume 2. 




2.5 The argument reversed: the gauge principle 


In the preceding section, we took it as known that the Schrödinger equation, 
for example, for a charged particle in an electromagnetic field, has the form 

$$\left(\frac{1}{2m}(-i\nabla - q\boldsymbol{A})^2 + qV\right)\psi = i\,\partial\psi/\partial t. \tag{2.46}$$
We then checked its gauge invariance under the combined transformation 
$$\begin{aligned}
\boldsymbol{A} &\to \boldsymbol{A}' = \boldsymbol{A} + \nabla\chi \\
V &\to V' = V - \partial\chi/\partial t \\
\psi &\to \psi' = \exp(iq\chi)\psi.
\end{aligned} \tag{2.47}$$


We now want to reverse the argument: we shall start by demanding that our 
theory is invariant under the space-time-dependent phase transformation 


$$\psi(\boldsymbol{x},t) \to \psi'(\boldsymbol{x},t) = \exp[iq\chi(\boldsymbol{x},t)]\,\psi(\boldsymbol{x},t). \tag{2.48}$$


We shall demonstrate that such a phase invariance is not possible for a free 
theory, but rather requires an interacting theory involving a (4-vector) field 
whose interactions with the charged particle are precisely determined, and 
which undergoes the transformation 


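$$\boldsymbol{A} \to \boldsymbol{A}' = \boldsymbol{A} + \nabla\chi \tag{2.49}$$
$$V \to V' = V - \partial\chi/\partial t \tag{2.50}$$

when $\psi \to \psi'$. The demand of this type of phase invariance will have then 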
dictated the form of the interaction — this is the basis of the gauge principle. 

Before proceeding we note that the resulting equation — which will of course 
turn out to be (2.29) — will not strictly speaking be invariant under (2.48), 
but rather covariant (in the gauge sense), as we saw in the preceding section. 
Nevertheless, we shall in this section sometimes continue (slightly loosely) to 
speak of ‘local phase invariance’. When we come to implement these ideas 
in quantum field theory in chapter 7 (section 7.4), using the Lagrangian for- 
malism, we shall see that the relevant Lagrangians are indeed invariant under 
(2.48). 

We therefore focus attention on the phase of the wavefunction. The abso- 
lute phase of a wavefunction in quantum mechanics cannot be measured; only 
relative phases are measurable, via some sort of interference experiment. A 
simple example is provided by the diffraction of particles by a two-slit system. 
Downstream from the slits, the wavefunction is a coherent superposition of 
two components, one originating from each slit: symbolically, 


$$\psi = \psi_1 + \psi_2. \tag{2.51}$$

The probability distribution $|\psi|^2$ will then involve, in addition to the separate 
intensities $|\psi_1|^2$ and $|\psi_2|^2$, the interference term 

$$2\,{\rm Re}(\psi_1^*\psi_2) = 2|\psi_1||\psi_2|\cos\delta \tag{2.52}$$

where $\delta$ ($= \delta_1 - \delta_2$) is the phase difference between components $\psi_1$ and $\psi_2$. The 
familiar pattern of alternating intensity maxima and minima is then attributed 
to variation in the phase difference $\delta$. Where the components are in phase, 
the interference is constructive and $|\psi|^2$ has a maximum; where they are out 
of phase, it is destructive and $|\psi|^2$ has a minimum. It is clear that if the 
individual phases $\delta_1$ and $\delta_2$ are each shifted by the same amount, there will 
be no observable consequences, since only the phase difference $\delta$ enters. 

The situation in which the wavefunction can be changed in a certain way 
without leading to any observable effects is precisely what is entailed by a 
symmetry or invariance principle in quantum mechanics. In the case under 
discussion, the invariance is that of a constant overall change in phase. In 
performing calculations it is necessary to make some definite choice of phase; 
that is, to adopt a ‘phase convention’. The invariance principle guarantees 
that any such choice, or convention, is equivalent to any other. 

Invariance under a constant change in phase is an example of a global 
invariance, according to the terminology introduced in the previous section. 
We make this point quite explicit by writing out the transformation as 


$$\psi \to \psi' = e^{i\alpha}\psi, \quad \alpha = \text{constant} \qquad \text{global phase invariance.} \qquad (2.53)$$


That $\alpha$ in (2.53) is a constant, the same for all space-time points, expresses
the fact that once a phase convention (choice of $\alpha$) has been made at one
space-time point, the same must be adopted at all other points. Thus in
the two-slit experiment we are not free to make a local change of phase: for
example, as discussed by ’t Hooft (1980), inserting a half-wave plate behind
just one of the slits will certainly have observable consequences.

There is a sense in which this may seem an unnatural state of affairs. Once 
a phase convention has been adopted at one space-time point, the same con- 
vention must be adopted at all other ones: the half-wave plate must extend 
instantaneously across all of space, or not at all. Following this line of thought, 
one might then be led to ‘explore the possibility’ of requiring invariance under 
local phase transformations: that is, independent choices of phase convention 
at each space-time point. By itself, the foregoing is not a compelling mo- 
tivation for such a step. However, as we pointed out in section 2.3, such a 
move from a global to a local invariance is apparently of crucial significance 
in classical electromagnetism and general relativity, and seems now to provide 
the key to an understanding of the other interactions in the Standard Model. 
Let us see, then, where the demand of ‘local phase invariance’ 


$$\psi(\mathbf{x}, t) \to \psi'(\mathbf{x}, t) = \exp[i\alpha(\mathbf{x}, t)]\,\psi(\mathbf{x}, t) \qquad \text{local phase invariance} \qquad (2.54)$$


leads us. 

There is immediately a problem: this is not an invariance of the free-particle
Schrödinger equation or of any free-particle relativistic wave equation!
For example, if the original wavefunction $\psi(\mathbf{x}, t)$ satisfied the free-particle
Schrödinger equation

$$\frac{1}{2m}(-i\nabla)^2\psi(\mathbf{x}, t) = i\,\partial\psi(\mathbf{x}, t)/\partial t \qquad (2.55)$$

then the wavefunction $\psi'$, given by the local phase transformation, will not,
since both $\nabla$ and $\partial/\partial t$ now act on $\alpha(\mathbf{x}, t)$ in the phase factor. Thus local phase
invariance is not an invariance of the free-particle wave equation. If we wish 
to satisfy the demands of local phase invariance, we are obliged to modify the 
free-particle Schrodinger equation into something for which there is a local 
phase invariance — or rather, more accurately, a corresponding covariance. 
But this modified equation will no longer describe a free particle: in other 
words, the freedom to alter the phase of a charged particle’s wavefunction 
locally is only possible if some kind of force field is introduced in which the 
particle moves. In more physical terms, the covariance will now be manifested 
in the inability to distinguish observationally between the effect of making a 
local change in phase convention and the effect of some new field in which the 
particle moves. 

What kind of field will this be? In fact, we know immediately what the 
answer is, since the local phase transformation 


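$$\psi \to \psi' = \exp[i\alpha(\mathbf{x}, t)]\,\psi \qquad (2.56)$$


with $\alpha = q\chi$ is just the phase transformation associated with electromagnetic
gauge invariance! Thus we must modify the Schrödinger equation

$$\frac{1}{2m}(-i\nabla)^2\psi = i\,\partial\psi/\partial t \qquad (2.57)$$
to
$$\frac{1}{2m}(-i\nabla - q\mathbf{A})^2\psi = (i\,\partial/\partial t - qV)\psi \qquad (2.58)$$
and satisfy the local phase invariance
$$\psi \to \psi' = \exp[i\alpha(\mathbf{x}, t)]\,\psi \qquad (2.59)$$

by demanding that $\mathbf{A}$ and $V$ transform by
$$\begin{aligned} \mathbf{A} \to \mathbf{A}' &= \mathbf{A} + q^{-1}\nabla\alpha \\ V \to V' &= V - q^{-1}\,\partial\alpha/\partial t \end{aligned} \qquad (2.60)$$
when $\psi \to \psi'$. The modified wave equation is of course precisely the Schrödinger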
equation describing the interaction of the charged particle with the electro- 
magnetic field described by A and V. 
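The covariance expressed by (2.58)-(2.60) can be tested numerically. The sketch below is an illustration under stated assumptions: it works in one space dimension, uses central finite differences, and the particular functions `psi`, `alpha`, `A`, `V` and the charge `Q` are arbitrary choices of ours, not taken from the text. It checks that the combinations $(-i\partial_x - qA)\psi$ and $(i\partial_t - qV)\psi$ simply acquire the factor $e^{i\alpha}$ when $\psi$, $A$ and $V$ are transformed together, whereas the free derivative $-i\partial_x\psi$ does not.

```python
import cmath
import math

Q = 0.8   # illustrative charge (assumption)
H = 1e-5  # finite-difference step

def psi(x, t):
    return cmath.exp(-(x - 0.3) ** 2) * cmath.exp(1j * (1.7 * x - 0.9 * t))

def alpha(x, t):
    return math.sin(1.3 * x) * math.cos(0.7 * t)

def A(x, t):
    return 0.5 * x * t

def V(x, t):
    return 0.2 * x ** 2 - 0.1 * t

def d_dx(f, x, t):
    return (f(x + H, t) - f(x - H, t)) / (2 * H)

def d_dt(f, x, t):
    return (f(x, t + H) - f(x, t - H)) / (2 * H)

def psi_prime(x, t):
    # Local phase transformation (2.59).
    return cmath.exp(1j * alpha(x, t)) * psi(x, t)

def A_prime(x, t):
    # First line of (2.60), restricted to one dimension.
    return A(x, t) + d_dx(alpha, x, t) / Q

def V_prime(x, t):
    # Second line of (2.60).
    return V(x, t) - d_dt(alpha, x, t) / Q

def space_cov(wave, pot, x, t):
    """(-i d/dx - q A) acting on wave."""
    return -1j * d_dx(wave, x, t) - Q * pot(x, t) * wave(x, t)

def time_cov(wave, pot, x, t):
    """(i d/dt - q V) acting on wave."""
    return 1j * d_dt(wave, x, t) - Q * pot(x, t) * wave(x, t)

def covariance_errors(x, t):
    phase = cmath.exp(1j * alpha(x, t))
    err_space = abs(space_cov(psi_prime, A_prime, x, t) - phase * space_cov(psi, A, x, t))
    err_time = abs(time_cov(psi_prime, V_prime, x, t) - phase * time_cov(psi, V, x, t))
    # Without the gauge field the derivative does not transform covariantly.
    err_free = abs(-1j * d_dx(psi_prime, x, t) - phase * (-1j) * d_dx(psi, x, t))
    return err_space, err_time, err_free

print(covariance_errors(0.2, 0.5))
```

The first two errors are at the level of the finite-difference accuracy, while the third is of order one. Since the first-order combination transforms by a pure phase factor, applying it twice shows the same for the squared operator in (2.58).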

In a Lorentz covariant treatment, $\mathbf{A}$ and $V$ will be regarded as parts of a
4-vector $A^\mu$, just as $-\nabla$ and $\partial/\partial t$ are parts of $\partial^\mu$ (see problem 2.1). Thus the
presence of the vector field $A^\mu$, interacting in a ‘universal’ prescribed way with
any particle of charge $q$, is dictated by local phase invariance. A vector field
such as $A^\mu$, introduced to guarantee local phase invariance, is called a ‘gauge
field’. The principle that the interaction should be so dictated by the phase
(or gauge) invariance is called the gauge principle: it allows us to write down
the wave equation for the interaction directly from the free particle equation
via the replacement (2.44)³. As before, the method clearly generalizes to the
four-dimensional case.




2.6 Comments on the gauge principle in electromagnetism 
Comment (i) 


A properly sceptical reader may have detected an important sleight of hand in 
the previous discussion. Where exactly did the electromagnetic charge appear 
from? The trouble with our argument as so far presented is that we could 
have defined fields A and V so that they coupled equally to all particles — 
instead we smuggled in a factor q. 

Actually we can do a bit better than this. We can use the fact that the 
electromagnetic charge is absolutely conserved to claim that there can be no 
quantum mechanical interference between states of different charge q. Hence 
different phase changes are allowed within each ‘sector’ of definite q: 


$$\psi' = \exp(iq\chi)\psi \qquad (2.61)$$


let us say. When this becomes a local transformation, $\chi \to \chi(\mathbf{x}, t)$, we shall
need to cancel a term $q\nabla\chi$, which will imply the presence of a ‘$-q\mathbf{A}$’ term,
as required. Note that such an argument is only possible for an absolutely
conserved quantum number $q$; otherwise we cannot split up the states of
the system into non-communicating sectors specified by different values of $q$.
Reversing this line of reasoning, a conservation law such as baryon number 
conservation, with no related gauge field, would therefore now be suspected 
of not being absolutely conserved. 

We still have not tied down why q is the electromagnetic charge and not 
some other absolutely conserved quantum number. A proper discussion of 
the reasons for identifying $A^\mu$ with the electromagnetic potential and $q$ with
the particle’s charge will be given in chapter 7 with the help of quantum field 
theory. 


Comment (ii) 


Accepting these identifications, we note that the form of the interaction contains
but one parameter, the electromagnetic charge $q$ of the particle in question.
It is the same whatever the type of particle with charge $q$, whether it
be lepton, hadron, nucleus, ion, atom, etc. Precisely this type of ‘universal- 
ity’ is present in the weak couplings of quarks and leptons, as we shall see in 
volume 2. This strongly suggests that some form of gauge principle must be
at work in generating weak interactions as well. The associated symmetry or
conservation law is, however, of a very subtle kind. Incidentally, although all
particles of a given charge $q$ interact electromagnetically in a universal way,
there is nothing at all in the preceding argument to indicate why, in nature,
the charges of observed particles are all integer multiples of one basic charge.

³ Actually the electromagnetic interaction is uniquely specified by this procedure only
for particles of spin-0 or spin-1/2. The spin-1 case will be discussed in volume 2.


Comment (iii) 


Returning to comment (i), we may wish that we had not had to introduce the 
absolute conservation of charge as a separate axiom. As remarked earlier, at 
the end of section 2.2, we should like to relate that conservation law to the 
symmetry involved, namely invariance under (2.54). It is worth looking at the 
nature of this symmetry in a little more detail. It is not a symmetry which 
— as in the case of translation and rotation invariances for instance — involves 
changes in the space-time coordinates x and t. Instead, it operates on the 
real and imaginary parts of the wavefunction. Let us write 


$$\psi = \psi_R + i\psi_I. \qquad (2.62)$$


Then
$$\psi' = e^{i\alpha}\psi = \psi'_R + i\psi'_I \qquad (2.63)$$


can be written as


$$\begin{aligned} \psi'_R &= (\cos\alpha)\,\psi_R - (\sin\alpha)\,\psi_I \\ \psi'_I &= (\sin\alpha)\,\psi_R + (\cos\alpha)\,\psi_I \end{aligned} \qquad (2.64)$$


from which we can see that it is indeed a kind of ‘rotation’, but in the $\psi_R$-$\psi_I$
plane, whose ‘coordinates’ are the real and imaginary parts of the wavefunction.
We call this plane an internal space and the associated symmetry an
internal symmetry. Thus our phase invariance can be looked upon as a kind
of internal space rotational invariance.

We can imagine doing two successive such transformations


$$\psi \to \psi' \to \psi'' \qquad (2.65)$$
where
$$\psi'' = e^{i\beta}\psi' \qquad (2.66)$$
and so
$$\psi'' = e^{i(\alpha+\beta)}\psi = e^{i\delta}\psi \qquad (2.67)$$


with $\delta = \alpha + \beta$. This is a transformation of the same form as the original one.
The set of all such transformations forms what mathematicians call a group, 
in this case U(1), meaning the group of all unitary one-dimensional matrices. 
A unitary matrix U is one such that 


$$UU^\dagger = U^\dagger U = 1 \qquad (2.68)$$


where 1 is the identity matrix and $\dagger$ denotes the Hermitian conjugate. A
one-dimensional matrix is of course a single number, in this case a complex
number. Condition (2.68) limits this to being a simple phase: the set of phase
factors of the form $e^{i\alpha}$, where $\alpha$ is any real number, form the elements of a
U(1) group. These are just the factors that enter into our gauge (or phase)
transformations for wavefunctions. Thus we say that the electromagnetic
gauge group is U(1). We must remember, however, that it is a local U(1),
meaning (cf (2.54)) that the phase parameters $\alpha, \beta, \ldots$ depend on the
space-time point $x$.

The transformations of the U(1) group have the simple property that it 
does not matter in what order they are performed: referring to (2.65)—(2.67), 
we would have got the same final answer if we had done the $\beta$ ‘rotation’ first
and then the $\alpha$ one, instead of the other way around; this is because, of course,


$$\exp(i\alpha)\cdot\exp(i\beta) = \exp[i(\alpha+\beta)] = \exp(i\beta)\cdot\exp(i\alpha). \qquad (2.69)$$


This property remains true even in the ‘local’ case when $\alpha$ and $\beta$ depend
on $x$. Mathematicians call U(1) an Abelian group: different transformations
commute. We shall see later (in volume 2) that the ‘internal’ symmetry spaces 
relevant to the strong and weak gauge invariances are not so simple. The 
‘rotations’ in these cases are more like full three-dimensional rotations of real 
space, rather than the two-dimensional rotation of (2.64). We know that, in 
general, such real-space rotations do not commute, and the same will be true 
of the strong and weak rotations. Their gauge groups are called non-Abelian. 
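The contrast between Abelian and non-Abelian transformations can be made concrete with a short numerical sketch. The angles and the choice of rotation axes below are illustrative assumptions, and the helper names are ours. U(1) phase factors are multiplied in both orders, and the same is done for two rotations of ordinary three-dimensional space about different axes.

```python
import cmath
import math

def u1(angle):
    """U(1) element exp(i * angle)."""
    return cmath.exp(1j * angle)

def rot_x(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[1, 0, 0], [0, c, -s], [0, s, c]]

def rot_z(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def max_diff(a, b):
    """Largest absolute difference between corresponding matrix elements."""
    return max(abs(a[i][j] - b[i][j]) for i in range(len(a)) for j in range(len(a)))

alpha, beta = 0.6, 1.1  # illustrative angles (assumptions)
u1_difference = abs(u1(alpha) * u1(beta) - u1(beta) * u1(alpha))
so3_difference = max_diff(matmul(rot_x(alpha), rot_z(beta)),
                          matmul(rot_z(beta), rot_x(alpha)))
print(u1_difference, so3_difference)
```

The first difference vanishes, in line with (2.69); the second does not, which is the sense in which real-space rotations about different axes fail to commute.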

Once again, we shall have to wait until chapter 7 before understanding 
how the symmetry represented by (2.63) is really related to the conservation 
law of charge. 


Comment (iv) 


The attentive reader may have picked up one further loose end. The vector 
potential A is related to the magnetic field B by 


$$\mathbf{B} = \nabla \times \mathbf{A}. \qquad (2.70)$$
Thus if $\mathbf{A}$ has the special form
$$\mathbf{A} = \nabla f \qquad (2.71)$$


B will vanish. The question we must answer, therefore, is: how do we know 
that the A field introduced by our gauge principle is not of the form (2.71), 
leading to a trivial theory (B = 0)? The answer to this question will lead us 
on a very worthwhile detour. 

The Schrödinger equation with $\nabla f$ as the vector potential is
$$\frac{1}{2m}(-i\nabla - q\nabla f)^2\psi = E\psi. \qquad (2.72)$$


We can write the formal solution to this equation as


$$\psi = \exp\left(iq\int_{-\infty}^{\mathbf{x}} \nabla f \cdot d\mathbf{l}\right)\cdot\psi(f = 0) \qquad (2.73)$$
which may be checked by using the fact that


$$\frac{\partial}{\partial a}\int^{a} f(t)\,dt = f(a). \qquad (2.74)$$


The notation $\psi(f = 0)$ means just the free-particle solution with $f = 0$; the
line integral is taken along an arbitrary path ending in the point $\mathbf{x}$. But we
have
$$df = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy + \frac{\partial f}{\partial z}\,dz \equiv \nabla f \cdot d\mathbf{l}. \qquad (2.75)$$
Hence the integral can be done trivially and the solution becomes
$$\psi = \exp[iq(f(\mathbf{x}) - f(-\infty))]\cdot\psi(f = 0). \qquad (2.76)$$


We say that the phase factor introduced by the (in reality, field-free) vector
potential $\mathbf{A} = \nabla f$ is integrable: the effect of this particular $\mathbf{A}$ is merely
to multiply the free-particle solution by an $\mathbf{x}$-dependent phase (apart from
a trivial constant phase). Since this $\mathbf{A}$ should give no real electromagnetic
effect, we must hope that such a change in the wavefunction is also somehow
harmless. Indeed Dirac showed (Dirac 1981, pp 92-3) that such a phase
factor corresponds merely to a redefinition of the momentum operator $\hat{p}$. The
essential point is that (in one dimension, say) $\hat{p}$ is defined ultimately by the
commutator ($\hbar = 1$)


$$[\hat{x}, \hat{p}] = i. \qquad (2.77)$$
Certainly the familiar choice
$$\hat{p} = -i\frac{\partial}{\partial x} \qquad (2.78)$$


satisfies this commutation relation. But we can also add any function of $x$
to $\hat{p}$, and this modified $\hat{p}$ will be still satisfactory since $\hat{x}$ commutes with
any function of $x$. More detailed considerations by Dirac showed that this
arbitrary function must actually have the form $\partial F/\partial x$, where $F$ is arbitrary.
Thus
$$\hat{p}' = -i\frac{\partial}{\partial x} + \frac{\partial F}{\partial x} \qquad (2.79)$$
is an acceptable momentum operator. Consider then the quantum mechanics
defined by the wavefunction $\psi(f = 0)$ and the momentum operator
$\hat{p} = -i\partial/\partial x$. Under the unitary transformation (cf (2.76))
$$\psi(f = 0) \to e^{iqf(x)}\psi(f = 0) \qquad (2.80)$$
$\hat{p}$ will be transformed to
$$\hat{p} \to e^{iqf(x)}\,\hat{p}\,e^{-iqf(x)}. \qquad (2.81)$$


But the right-hand side of this equation is just $\hat{p} - q\,\partial f/\partial x$ (problem 2.3),
which is an equally acceptable momentum operator, identifying $qf$ with the
$F$ of Dirac. Thus the case $\mathbf{A} = \nabla f$ is indeed equivalent to the field-free case.

FIGURE 2.1
Two paths $C_1$ and $C_2$ (in two dimensions for simplicity) from $-\infty$ to the point
$\mathbf{x}$.


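What of the physically interesting case in which $\mathbf{A}$ is not of the form $\nabla f$?
The equation is now


$$\frac{1}{2m}(-i\nabla - q\mathbf{A})^2\psi = E\psi \qquad (2.82)$$


to which the solution is
$$\psi = \exp\left(iq\int_{-\infty}^{\mathbf{x}} \mathbf{A}\cdot d\mathbf{l}\right)\cdot\psi(\mathbf{A} = 0). \qquad (2.83)$$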


The line integral can now not be done so trivially: one says that the A-field 
has produced a non-integrable phase factor. There is more to this terminology 
than the mere question of whether the integral is easy to do. The crucial point 
is that the integral now depends on the path followed in reaching the point $\mathbf{x}$,
whereas the integrable phase factor in (2.73) depends only on the end-points 
of the integral, not on the path joining them. 

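Consider two paths $C_1$ and $C_2$ (figure 2.1) from $-\infty$ to the point $\mathbf{x}$. The
difference in the two line integrals is the integral over a closed curve $C$, which
can be evaluated by Stokes’ theorem:


$$\int_{C_1}^{\mathbf{x}} \mathbf{A}\cdot d\mathbf{l} - \int_{C_2}^{\mathbf{x}} \mathbf{A}\cdot d\mathbf{l} = \oint_C \mathbf{A}\cdot d\mathbf{l} = \iint_S \nabla\times\mathbf{A}\cdot d\mathbf{S} = \iint_S \mathbf{B}\cdot d\mathbf{S} \qquad (2.84)$$


where $S$ is any surface spanning the curve $C$. In this form we see that if
$\mathbf{A} = \nabla f$, then indeed the line integrals over $C_1$ and $C_2$ are equal since $\nabla\times\nabla f = 0$,
but if $\mathbf{B} = \nabla\times\mathbf{A}$ is not zero, the difference between the integrals is determined
by the enclosed flux of $\mathbf{B}$.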

This analysis turns out to imply the existence of a remarkable phenomenon 
— the Aharonov-Bohm effect, named after its discoverers (Aharonov and Bohm 
1959). Suppose we go back to our two-slit experiment of section 2.5, only this 
time we imagine that a long thin solenoid is inserted between the slits, so 
that the components $\psi_1$ and $\psi_2$ of the split beam pass one on each side of
the solenoid (figure 2.2). After passing round the solenoid, the beams are
recombined, and the resulting interference pattern is observed downstream.
At any point $\mathbf{x}$ of the pattern, the phase of the $\psi_1$ and $\psi_2$ components will be
modified, relative to the $\mathbf{B} = 0$ case, by factors of the form (2.83). These
factors depend on the respective paths, which are different for the two
components $\psi_1$ and $\psi_2$. The phase difference between these components, which
determines the interference pattern, will therefore involve the $\mathbf{B}$-dependent
factor (2.84). Thus, even though the field $\mathbf{B}$ is essentially totally contained
within the solenoid, and the beams themselves have passed through $\mathbf{B} = 0$
regions only, there is nevertheless an observable effect on the pattern provided
$\mathbf{B} \neq 0$! This effect (a shift in the pattern as $\mathbf{B}$ varies) was first confirmed
experimentally by Chambers (1960), soon after its prediction by Aharonov and
Bohm. It was anticipated in work by Ehrenberg and Siday (1949); further
references and discussion are contained in Berry (1984).

FIGURE 2.2
The Aharonov-Bohm effect.
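The role of the enclosed flux in (2.84) can be illustrated numerically. The sketch below rests on assumptions that go beyond the text: it takes the standard vector potential outside an ideal thin solenoid lying along the $z$-axis, $\mathbf{A} = (\Phi/2\pi r^2)(-y, x)$, uses circular loops, and picks an arbitrary flux `flux` and an illustrative gradient field with $f = x^2 y$. The closed line integral is estimated with a simple midpoint rule.

```python
import math

def solenoid_A(x, y, flux):
    """Vector potential outside an ideal thin solenoid on the z-axis (assumed gauge choice)."""
    r2 = x * x + y * y
    return (-flux * y / (2 * math.pi * r2), flux * x / (2 * math.pi * r2))

def gradient_A(x, y):
    """A = grad f for the illustrative choice f = x^2 * y."""
    return (2 * x * y, x * x)

def loop_integral(field, centre, radius, steps=20000):
    """Midpoint-rule estimate of the closed line integral of `field` round a circle."""
    cx, cy = centre
    dtheta = 2 * math.pi / steps
    total = 0.0
    for k in range(steps):
        theta = (k + 0.5) * dtheta
        x = cx + radius * math.cos(theta)
        y = cy + radius * math.sin(theta)
        ax, ay = field(x, y)
        # Line element: dl = radius * (-sin(theta), cos(theta)) * dtheta.
        total += (-ax * math.sin(theta) + ay * math.cos(theta)) * radius * dtheta
    return total

flux = 2.5  # illustrative flux through the solenoid (assumption)
around = loop_integral(lambda x, y: solenoid_A(x, y, flux), (0.0, 0.0), 1.0)
beside = loop_integral(lambda x, y: solenoid_A(x, y, flux), (3.0, 0.0), 1.0)
pure_gradient = loop_integral(gradient_A, (0.0, 0.0), 1.0)
print(around, beside, pure_gradient)
```

A loop encircling the solenoid returns the flux, although $\mathbf{B} = 0$ everywhere on the loop itself, while a loop that does not encircle it, or any loop in a pure-gradient potential, returns zero. Multiplying the first result by $q$ gives the relative phase entering the interference pattern.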


Comment (v) 


In conclusion, we must emphasize that there is ultimately no compelling logic 
for the vital leap to a local phase invariance from a global one. The latter is, 
by itself, both necessary and sufficient in quantum field theory to guarantee 
local charge conservation. Nevertheless, the gauge principle — deriving inter- 
actions from the requirement of local phase invariance — provides a satisfying 
conceptual unification of the interactions present in the Standard Model. In 
volume 2 of this book we shall consider generalizations of the electromagnetic 
gauge principle. It will be important always to bear in mind that any at- 
tempt to base theories of non-electromagnetic interactions on some kind of 
gauge principle can only make sense if there is an exact symmetry involved. 
The reason for this will only become clear when we consider the renormalizability
of QED in chapter 11.

Problems


2.1


(a) A Lorentz transformation in the $x^1$ direction is given by

$$\begin{aligned} t' &= \gamma(t - vx^1) \\ x^{1\prime} &= \gamma(-vt + x^1) \\ x^{2\prime} &= x^2, \quad x^{3\prime} = x^3 \end{aligned}$$

where $\gamma = (1 - v^2)^{-1/2}$ and $c = 1$. Write down the inverse of this
transformation (i.e. express $(t, x^1)$ in terms of $(t', x^{1\prime})$), and use the
‘chain rule’ of partial differentiation to show that, under the Lorentz
transformation, the two quantities $(\partial/\partial t, -\partial/\partial x^1)$ transform in the
same way as $(t, x^1)$.


[The general result is that the four-component quantity $(\partial/\partial t,
-\partial/\partial x^1, -\partial/\partial x^2, -\partial/\partial x^3) \equiv (\partial/\partial t, -\nabla)$ transforms in the same
way as $(t, x^1, x^2, x^3)$. Four-component quantities transforming this
way are said to be ‘contravariant 4-vectors’, and are written with
an upper 4-vector index; thus $(\partial/\partial t, -\nabla) \equiv \partial^\mu$. Upper indices
can be lowered by using the metric tensor $g_{\mu\nu}$, see appendix D,
which reverses the sign of the spatial components. Thus $\partial_\mu =
(\partial/\partial t, \partial/\partial x^1, \partial/\partial x^2, \partial/\partial x^3)$. Similarly the four quantities $(\partial/\partial t, \nabla)
= (\partial/\partial t, \partial/\partial x^1, \partial/\partial x^2, \partial/\partial x^3)$ transform as $(t, -x^1, -x^2, -x^3)$ and
are a ‘covariant 4-vector’, denoted by $\partial_\mu$.]


(b) Check that equation (2.5) can be written as (2.17).


2.2 How many independent components does the field strength $F^{\mu\nu}$ have?
Express each component in terms of electric and magnetic field components.
Hence verify that equation (2.18) correctly reproduces both equations (2.1)
and (2.8).

2.3 Verify the result

$$e^{iqf(x)}\,\hat{p}\,e^{-iqf(x)} = \hat{p} - q\frac{\partial f}{\partial x}.$$


3 


Relativistic Quantum Mechanics 


It is clear that the non-relativistic Schrödinger equation is quite inadequate
to analyse the results of experiments at energies far higher than the rest
mass energies of the particles involved. Besides, the quarks and leptons have
spin-1/2, a degree of freedom absent from the Schrödinger wavefunction. We
therefore need two generalizations: from non-relativistic to relativistic for
spin-0 particles, and from spin-0 to spin-1/2. The first step is to the
Klein-Gordon equation (section 3.1), the second to the Dirac equation (section 3.2).
Then after some further work on solutions of the Dirac equation (sections 3.3-
3.4), we shall consider (section 3.5) some simple consequences of including the
electromagnetic interaction via the gauge principle replacement (2.44).




3.1 The Klein—Gordon equation 


The non-relativistic Schrödinger equation may be put into correspondence 
with the non-relativistic energy-momentum relation 


$$E = \mathbf{p}^2/2m \qquad (3.1)$$

by means of the operator replacements¹
$$E \to i\,\partial/\partial t \qquad (3.2)$$
$$\mathbf{p} \to -i\nabla, \qquad (3.3)$$


these differential operators being understood to act on the Schrödinger wave- 
function. 

For a relativistic wave equation we must start with the correct relativistic 
energy-momentum relation. Energy and momentum appear as the ‘time’ and 
‘space’ components of the momentum 4-vector 


$$p^\mu = (E, \mathbf{p}) \qquad (3.4)$$
which satisfy the mass-shell condition
$$p^2 = p_\mu p^\mu = E^2 - \mathbf{p}^2 = m^2. \qquad (3.5)$$


¹ Recall $\hbar = c = 1$ throughout (see appendix B).

Since energy and momentum are merely different components of a 4-vector,
an attempt to base a relativistic theory on the relation
$$E = +(\mathbf{p}^2 + m^2)^{1/2} \qquad (3.6)$$


is unattractive, as well as having obvious difficulties in interpretation for the
square root operator. Schrödinger, before settling for the less ambitious
non-relativistic Schrödinger equation, and later Klein and Gordon, attempted to
build relativistic quantum mechanics (RQM) from the squared relation


$$E^2 = \mathbf{p}^2 + m^2. \qquad (3.7)$$
Using the operator replacements for $E$ and $\mathbf{p}$ we are led to
$$-\partial^2\phi/\partial t^2 = (-\nabla^2 + m^2)\phi \qquad (3.8)$$


which is the Klein-Gordon equation (KG equation). We consider the case of a
one-component scalar wavefunction $\phi(\mathbf{x}, t)$: one expects this to be appropriate
for the description of spin-0 bosons.


3.1.1 Solutions in coordinate space 


In terms of the D’Alembertian operator 


$$\Box \equiv \partial_\mu\partial^\mu = \frac{\partial^2}{\partial t^2} - \nabla^2 \qquad (3.9)$$
the KG equation reads:
$$(\Box + m^2)\phi(\mathbf{x}, t) = 0. \qquad (3.10)$$


Let us look for a plane-wave solution of the form
$$\phi(\mathbf{x}, t) = Ne^{-iEt + i\mathbf{p}\cdot\mathbf{x}} = Ne^{-ip\cdot x} \qquad (3.11)$$


where we have written the exponent in suggestive 4-vector scalar product
notation

$$p\cdot x = p^\mu x_\mu = Et - \mathbf{p}\cdot\mathbf{x} \qquad (3.12)$$

and $N$ is a normalization factor which need not be decided upon here (see
section 8.1.1). In order that this wavefunction be a solution of the KG equation,
we find by direct substitution that $E$ must be related to $\mathbf{p}$ by the condition


$$E^2 = \mathbf{p}^2 + m^2. \qquad (3.13)$$


This looks harmless enough, but it actually implies that for a given 3-momentum
$\mathbf{p}$ there are in fact two possible solutions for the energy, namely


$$E = \pm(\mathbf{p}^2 + m^2)^{1/2}. \qquad (3.14)$$


As Schrödinger and others quickly found, it is not possible to ignore the nega-
tive solutions without obtaining inconsistencies. What then do these negative-
energy solutions mean? 
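The ‘direct substitution’ leading to (3.13) and (3.14) can be mimicked numerically. The following is a sketch under stated assumptions: one space dimension, central finite differences, and illustrative values of $m$ and $p$ that are not taken from the text; the function names are ours. It evaluates $(\partial^2/\partial t^2 - \partial^2/\partial x^2 + m^2)\phi$ on the plane wave (3.11) for both signs of the energy, and for a deliberately wrong (non-relativistic) energy.

```python
import cmath
import math

def plane_wave(E, p, x, t, N=1.0):
    """phi = N exp(-i E t + i p x), the one-dimensional form of (3.11)."""
    return N * cmath.exp(-1j * E * t + 1j * p * x)

def kg_residual(E, p, m, x=0.3, t=0.2, h=1e-4):
    """|(d^2/dt^2 - d^2/dx^2 + m^2) phi| estimated with central differences."""
    phi = lambda xx, tt: plane_wave(E, p, xx, tt)
    d2t = (phi(x, t + h) - 2 * phi(x, t) + phi(x, t - h)) / h ** 2
    d2x = (phi(x + h, t) - 2 * phi(x, t) + phi(x - h, t)) / h ** 2
    return abs(d2t - d2x + m ** 2 * phi(x, t))

m, p = 1.0, 2.0  # illustrative mass and momentum (assumptions)
E = math.sqrt(p * p + m * m)
print(kg_residual(E, p, m), kg_residual(-E, p, m), kg_residual(p * p / (2 * m), p, m))
```

Both signs of $E$ give a residual consistent with zero at the accuracy of the finite differences, whereas the non-relativistic value $p^2/2m$ does not: the equation itself makes no distinction between the two roots in (3.14).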
3.1.2 Probability current for the KG equation


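In exactly the same way as for the non-relativistic Schrödinger equation, it
is possible to derive a conservation law for a ‘probability current’ of the KG
equation. We have

$$\frac{\partial^2\phi}{\partial t^2} - \nabla^2\phi + m^2\phi = 0 \qquad (3.15)$$

and by multiplying this equation by $\phi^*$, and subtracting $\phi$ times the
complex conjugate of equation (3.15), one obtains, after some manipulation (see
problem 3.1), the result


$$\frac{\partial\rho}{\partial t} + \nabla\cdot\mathbf{j} = 0 \qquad (3.16)$$
where
$$\rho = i\left[\phi^*\frac{\partial\phi}{\partial t} - \left(\frac{\partial\phi^*}{\partial t}\right)\phi\right] \qquad (3.17)$$
and
$$\mathbf{j} = i^{-1}[\phi^*\nabla\phi - (\nabla\phi^*)\phi] \qquad (3.18)$$


(the derivatives $(\partial_\mu\phi^*)$ act only within the bracket). In explicit 4-vector
notation this conservation condition reads (cf problem 2.1 and equation (D.4)
in appendix D)

$$\partial_\mu j^\mu = 0 \qquad (3.19)$$


with

$$j^\mu \equiv (\rho, \mathbf{j}) = i[\phi^*\partial^\mu\phi - (\partial^\mu\phi^*)\phi]. \qquad (3.20)$$
Since $\phi$ of (3.11) is Lorentz invariant and $\partial^\mu$ is a contravariant 4-vector,
equation (3.20) shows explicitly that $j^\mu$ is a contravariant 4-vector, as anticipated
in the notation.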

The spatial current $\mathbf{j}$ is identical in form to the Schrödinger current, but
for the KG case the ‘probability density’ now contains time derivatives since
the KG equation is second order in $\partial/\partial t$. This means that $\rho$ is not constrained
to be positive definite, so how can $\rho$ represent a probability density? We can
see this problem explicitly for the plane-wave solutions


$$\phi = Ne^{-iEt + i\mathbf{p}\cdot\mathbf{x}} \qquad (3.21)$$


which give (problem 3.1)
$$\rho = 2|N|^2E \qquad (3.22)$$


and $E$ can be positive or negative: that is, the sign of $\rho$ is the sign of energy.

Historically, this problem of negative probabilities coupled with that of 
negative energies led to the abandonment of the KG equation. For the mo- 
ment we will follow history, and turn to the Dirac equation. We shall see in 
section 3.4, however, how the negative-energy solutions of the KG equation 
do after all have a role to play, following Feynman’s interpretation, in pro- 
cesses involving antiparticles. Later, in chapters 5-7, we shall see how this 
interpretation arises naturally within the formalism of quantum field theory. 
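Before turning to the Dirac equation, the sign problem of the KG density can be seen in a small numerical sketch. It evaluates $\rho$ of (3.17) on the plane wave (3.21) by a central finite difference in $t$; the values of $m$, $p$ and $N$ are illustrative assumptions, and the function names are ours.

```python
import cmath
import math

def make_plane_wave(E, p, N):
    """Return phi(x, t) = N exp(-i E t + i p x), the one-dimensional form of (3.21)."""
    return lambda x, t: N * cmath.exp(-1j * E * t + 1j * p * x)

def kg_density(phi, x, t, h=1e-6):
    """rho = i[phi* dphi/dt - (dphi*/dt) phi], as in (3.17), by a central difference."""
    dphi_dt = (phi(x, t + h) - phi(x, t - h)) / (2 * h)
    value = 1j * (phi(x, t).conjugate() * dphi_dt - dphi_dt.conjugate() * phi(x, t))
    return value.real

m, p, N = 1.0, 0.5, 0.7  # illustrative values (assumptions)
E = math.sqrt(p * p + m * m)
rho_plus = kg_density(make_plane_wave(E, p, N), 0.1, 0.4)
rho_minus = kg_density(make_plane_wave(-E, p, N), 0.1, 0.4)
print(rho_plus, rho_minus, 2 * abs(N) ** 2 * E)
```

The two densities are equal in magnitude to $2|N|^2E$ of (3.22) but opposite in sign, so $\rho$ cannot be interpreted directly as a probability density.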
3.2 The Dirac equation


In the case of the KG equation it is clear why the problem arose: 


(i) In constructing a wave equation in close correspondence with the
squared energy-momentum relation


$$E^2 = \mathbf{p}^2 + m^2$$


we immediately allowed negative-energy solutions.


(ii) The KG equation has a $\partial^2/\partial t^2$ term: this leads to a continuity
equation with a ‘probability density’ containing $\partial/\partial t$, and hence to
negative probabilities.


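Dirac approached these problems in his characteristically direct way. In
order to obtain a positive-definite probability density $\rho > 0$, he required an
equation linear in $\partial/\partial t$. Then, for relativistic covariance (see chapter 4), the
equation must also be linear in $\nabla$. He postulated the equation (Dirac 1928)


$$i\frac{\partial\psi(\mathbf{x}, t)}{\partial t} = \left[-i\left(\alpha_1\frac{\partial}{\partial x^1} + \alpha_2\frac{\partial}{\partial x^2} + \alpha_3\frac{\partial}{\partial x^3}\right) + \beta m\right]\psi(\mathbf{x}, t) = (-i\boldsymbol{\alpha}\cdot\nabla + \beta m)\psi(\mathbf{x}, t). \qquad (3.23)$$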


What are the $\alpha$’s and $\beta$? To find the conditions on the $\alpha$’s and $\beta$, consider
what we require of a relativistic wave equation:


(i) the correct relativistic relation between $E$ and $\mathbf{p}$, namely


$$E = +(\mathbf{p}^2 + m^2)^{1/2}$$


(ii) the equation should be covariant under Lorentz transformations.


We shall postpone discussion of (ii) until the following chapter. To solve
requirement (i), Dirac in fact demanded that his wavefunction $\psi$ satisfy, in
addition, a KG-type condition


$$-\partial^2\psi/\partial t^2 = (-\nabla^2 + m^2)\psi. \qquad (3.24)$$


We note with hindsight that we have once more opened the door to negative- 
energy solutions: Dirac’s remarkable achievement was to turn this apparent 
defect into one of the triumphs of theoretical physics! 

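We can now derive conditions on $\boldsymbol{\alpha}$ and $\beta$. We have


$$i\,\partial\psi/\partial t = (-i\boldsymbol{\alpha}\cdot\nabla + \beta m)\psi \qquad (3.25)$$
and so, squaring the operator on both sides,


$$-\frac{\partial^2\psi}{\partial t^2} = -\sum_{i=1}^{3}\alpha_i^2\frac{\partial^2\psi}{(\partial x^i)^2} - \sum_{\substack{i,j=1 \\ i>j}}^{3}(\alpha_i\alpha_j + \alpha_j\alpha_i)\frac{\partial^2\psi}{\partial x^i\,\partial x^j} - im\sum_{i=1}^{3}(\alpha_i\beta + \beta\alpha_i)\frac{\partial\psi}{\partial x^i} + \beta^2m^2\psi. \qquad (3.26)$$


But by our assumption that $\psi$ also satisfies the KG condition, we must have


$$-\frac{\partial^2\psi}{\partial t^2} = -\sum_{i=1}^{3}\frac{\partial^2\psi}{(\partial x^i)^2} + m^2\psi. \qquad (3.27)$$


It is thus evident that the $\alpha$’s and $\beta$ cannot be ordinary, classical, commuting
quantities. Instead they must satisfy the following anticommutation relations
in order to eliminate the unwanted terms on the right-hand side of equation
(3.26):


$$\alpha_i\beta + \beta\alpha_i = 0 \qquad i = 1, 2, 3 \qquad (3.28)$$
$$\alpha_i\alpha_j + \alpha_j\alpha_i = 0 \qquad i, j = 1, 2, 3;\ i \neq j. \qquad (3.29)$$

In addition we require
$$\alpha_i^2 = \beta^2 = 1. \qquad (3.30)$$


Dirac proposed that the $\alpha$’s and $\beta$ should be interpreted as matrices, acting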
on a wavefunction which had several components arranged as a column vector. 
Anticipating somewhat the results of the next section, we would expect that, 
since each such component obeys the same wave equation, the physical states 
which they represent would have the same energy. This would mean that the 
different components represent some degeneracy, associated with a new degree 
of freedom. 

The degree of freedom is, of course, spin — an entirely quantum mechani- 
cal angular momentum, analogous to (but not equivalent to) orbital angular 
momentum. Consider, for example, the wavefunctions for the 2p state in the 
simple non-relativistic theory of the hydrogen atom. There are three of them, 
all degenerate with energy given by the $n = 2$ Bohr energy. The three
corresponding states all have orbital angular momentum quantum number $l$ equal
to 1; they differ in their values of the ‘magnetic’ quantum number $m$ (i.e.
the eigenvalue of the $z$-component of the orbital angular momentum operator
$\hat{L}_z$). Specifically, these three wavefunctions have the form (omitting
normalization constants) $(r\sin\theta\,e^{i\phi}, r\sin\theta\,e^{-i\phi}, r\cos\theta)\,e^{-r/2r_B}$, where $r_B$ is the Bohr
radius. Remembering the expressions for the Cartesian coordinates $x$, $y$ and $z$
in terms of the spherical polar coordinates $r$, $\theta$ and $\phi$, we see that by a suitable
linear combination (always allowed for degenerate states) we can write these
wavefunctions as $(x, y, z)f(r)$, where again a normalization factor has been
omitted. In this form it is plain that the multiplicity of the p-state wavefunc- 
tions can be interpreted in simple geometrical terms: they are effectively the 
components of a vector (multiplication by the scalar function f(r) does not 
affect this). 

The several components of the Dirac wavefunction together make up a 
similar, but quite distinct, object called a spinor. We shall have more to say 
about this in chapter 4. For the moment we continue with the problem of 
finding the matrices $\alpha_i$ and $\beta$ to satisfy (3.28)-(3.30).

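As problem 3.2 shows, the smallest possible dimension of the matrices for
which the Dirac conditions can be satisfied is $4 \times 4$. One conventional choice
of the $\alpha$’s and $\beta$ is
$$\alpha_i = \begin{pmatrix} 0 & \sigma_i \\ \sigma_i & 0 \end{pmatrix} \qquad \beta = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \qquad (3.31)$$


where we have written these $4 \times 4$ matrices in $2 \times 2$ ‘block diagonal’ form, the
$\sigma_i$’s are the $2 \times 2$ Pauli matrices, 1 is the $2 \times 2$ unit matrix, and 0 is the $2 \times 2$
null matrix. The Pauli matrices (see appendix A) are defined by


$$\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \qquad \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} \qquad \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \qquad (3.32)$$


Readers unfamiliar with the labour-saving ‘block’ form of (3.31) should verify,
both by using the corresponding explicit $4 \times 4$ matrices, such as


$$\alpha_1 = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix} \qquad (3.33)$$
and so on, and by the block diagonal form, that this choice does indeed satisfy
the required conditions. These are


$$\{\alpha_i, \beta\} = 0 \qquad (3.34)$$
$$\{\alpha_i, \alpha_j\} = 2\delta_{ij}1 \qquad (3.35)$$
$$\beta^2 = 1 \qquad (3.36)$$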


where {A,B} is the anticommutator of two matrices, AB + BA, and 1 is 
here the 4 x 4 unit matrix. 
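The verification suggested above can also be delegated to a few lines of code. This is a minimal sketch using plain nested lists; the helper names (`kron`, `matmul`, `anticommutator`, `satisfies_dirac_algebra`) are ours. The $4 \times 4$ matrices of (3.31) are built from the Pauli matrices as Kronecker products, and conditions (3.34)-(3.36) are tested directly.

```python
def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
            for i in range(len(a)) for k in range(len(b))]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def anticommutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] + ba[i][j] for j in range(len(a))] for i in range(len(a))]

def equals(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(len(a)) for j in range(len(a)))

def identity_times(c, n=4):
    return [[c if i == j else 0 for j in range(n)] for i in range(n)]

I2 = [[1, 0], [0, 1]]
SIGMA = [
    [[0, 1], [1, 0]],      # sigma_x
    [[0, -1j], [1j, 0]],   # sigma_y
    [[1, 0], [0, -1]],     # sigma_z
]
# Standard representation (3.31): sigma_i in the off-diagonal blocks, +1 and -1 blocks in beta.
ALPHA = [kron([[0, 1], [1, 0]], s) for s in SIGMA]
BETA = kron([[1, 0], [0, -1]], I2)

def satisfies_dirac_algebra(alphas, beta):
    """Check (3.34), (3.35) and (3.36) for the given matrices."""
    ok = equals(matmul(beta, beta), identity_times(1))
    for i, a_i in enumerate(alphas):
        ok = ok and equals(anticommutator(a_i, beta), identity_times(0))
        for j, a_j in enumerate(alphas):
            ok = ok and equals(anticommutator(a_i, a_j), identity_times(2 if i == j else 0))
    return ok

print(satisfies_dirac_algebra(ALPHA, BETA))
```

The same function can be applied to any other candidate set of matrices, for example one obtained from (3.31) by a unitary change of basis.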

At this point we can already begin to see that the extra multiplicity is 
very likely to have something to do with an angular momentum-like degree of 
freedom. In fact, if we define the spin matrices $\mathbf{S}$ by $\mathbf{S} = \frac{1}{2}\boldsymbol{\sigma}$ ($\hbar = 1$), we find
from (3.32) that

$$[S_x, S_y] = iS_z \qquad (3.37)$$
(with obvious cyclic permutations), which are precisely the commutation
relations satisfied by the components $\hat{J}_x$, $\hat{J}_y$ and $\hat{J}_z$ of the angular momentum
operator $\hat{\mathbf{J}}$ in quantum mechanics (see appendix A). Furthermore, the
eigenvalues of $S_z$ are $\pm\frac{1}{2}$, and of $\mathbf{S}^2$ are $s(s + 1)$ with $s = \frac{1}{2}$. So these matrices
undoubtedly represent quantum mechanical angular momentum operators,
appropriate to a state with angular momentum quantum number $j = \frac{1}{2}$. This
is precisely what ‘spin’ is. We will discuss this in more detail in section 3.3. 
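As a quick numerical companion to (3.37), the sketch below forms $\mathbf{S} = \frac{1}{2}\boldsymbol{\sigma}$ from (3.32) and evaluates the commutator $[S_x, S_y]$ and the sum $S_x^2 + S_y^2 + S_z^2$. The helper names are ours, and only $2 \times 2$ matrices stored as nested lists are used.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]

# S = sigma / 2 with hbar = 1, using the Pauli matrices of (3.32).
SX = [[0, 0.5], [0.5, 0]]
SY = [[0, -0.5j], [0.5j, 0]]
SZ = [[0.5, 0], [0, -0.5]]

def s_squared():
    """S_x^2 + S_y^2 + S_z^2 as a 2x2 matrix."""
    squares = [matmul(s, s) for s in (SX, SY, SZ)]
    return [[sum(sq[i][j] for sq in squares) for j in range(2)] for i in range(2)]

comm_xy = commutator(SX, SY)
print(comm_xy, s_squared())
```

The commutator reproduces $iS_z$, and $\mathbf{S}^2$ comes out as $\frac{3}{4}$ times the unit matrix, that is $s(s+1)$ with $s = \frac{1}{2}$; $S_z$ is already diagonal, with entries $\pm\frac{1}{2}$.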
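It is important to note that the choice (3.31) of $\boldsymbol{\alpha}$ and $\beta$ is not unique. In
fact, all matrices related to these by any unitary $4 \times 4$ matrix $U$ (which thus
preserves the anticommutation relations) are allowed:
$$\alpha_i' = U\alpha_iU^{-1} \qquad (3.38)$$
$$\beta' = U\beta U^{-1}. \qquad (3.39)$$


Another commonly used representation is provided by the matrices


$$\boldsymbol{\alpha} = \begin{pmatrix} \boldsymbol{\sigma} & 0 \\ 0 & -\boldsymbol{\sigma} \end{pmatrix} \qquad \beta = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}. \qquad (3.40)$$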


The reader may check (problem 3.2) that these matrices also satisfy (3.34)- 
(3.36). 

Unless otherwise stated, we shall use the standard representation (3.31). 
This is generally convenient for ‘low energy’ applications — that is, when the 
momentum $|\mathbf{p}|$ is significantly smaller than the mass $m$. In that case, $\beta m$ will
be the largest term in the Dirac Hamiltonian (see (3.23)), and it is sensible 
to have it in diagonal form. The choice (3.40), by contrast, is more natural 
when the mass is small compared with the energy or momentum. 


3.2.1 Free-particle solutions 


Since the Dirac Hamiltonian now involves 4 x 4 matrices, it is clear that we 
must interpret the Dirac wavefunction 4 as a four-component column vector — 
the so-called Dirac spinor. Let us look at the explicit form of the free-particle 
solutions. As in the KG case, we look for solutions in which the space-time 
behaviour is of plane-wave form and put 


Y =we Pz (3.41) 


where w is a four-component spinor independent of x, and eP", with p” = 
(E, p), is the plane-wave solution corresponding to 4-momentum p”. We sub- 
stitute this into the Dirac equation 


¡0y/0t= (—ia- V + Bm) (3.42) 


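using the explicit $\boldsymbol{\alpha}$ and $\beta$ matrices. In order to use the $2 \times 2$ block form, it is
conventional (and convenient) to split the spinor $\omega$ into two two-component
spinors $\phi$ and $\chi$:

$$ \omega = \begin{pmatrix} \phi \\ \chi \end{pmatrix}. \qquad (3.43) $$

We obtain the matrix equation (see problem 3.3)

$$ E \begin{pmatrix} \phi \\ \chi \end{pmatrix} = \begin{pmatrix} m\mathbf{1} & \boldsymbol{\sigma} \cdot \mathbf{p} \\ \boldsymbol{\sigma} \cdot \mathbf{p} & -m\mathbf{1} \end{pmatrix} \begin{pmatrix} \phi \\ \chi \end{pmatrix} \qquad (3.44) $$

representing two coupled equations for $\phi$ and $\chi$:

$$ (E - m)\phi = \boldsymbol{\sigma} \cdot \mathbf{p}\, \chi \qquad (3.45) $$

and

$$ (E + m)\chi = \boldsymbol{\sigma} \cdot \mathbf{p}\, \phi. \qquad (3.46) $$

Solving for $\chi$ from (3.46), the general four-component spinor may be written
(without worrying about normalization for the moment)

$$ \omega = \begin{pmatrix} \phi \\ \dfrac{\boldsymbol{\sigma} \cdot \mathbf{p}}{E + m}\, \phi \end{pmatrix}. \qquad (3.47) $$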


What is the relation between $E$ and $\mathbf{p}$ for this to be a solution of the Dirac
equation? If we substitute $\chi$ from (3.46) into (3.45) and remember that
(problem 3.4)

$$ (\boldsymbol{\sigma} \cdot \mathbf{p})^2 = \mathbf{p}^2 \mathbf{1} \qquad (3.48) $$

we find that

$$ (E - m)(E + m)\phi = \mathbf{p}^2 \phi \qquad (3.49) $$

for any $\phi$. Hence we arrive at the same result as for the KG equation in that
for a given value of $\mathbf{p}$, two values of $E$ are allowed:

$$ E = \pm(\mathbf{p}^2 + m^2)^{1/2} \qquad (3.50) $$

i.e. positive- and negative-energy solutions are still admitted.
The Dirac equation does not therefore solve this problem. What about
the probability current?
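The chain (3.44)-(3.50) can be checked numerically. The sketch below, with arbitrarily chosen test values for $\mathbf{p}$, $m$ and $\phi$, builds the spinor (3.47) and applies the momentum-space Hamiltonian $\boldsymbol{\alpha} \cdot \mathbf{p} + \beta m$ in the standard representation; the residual $|H\omega - E\omega|$ should vanish for $E = \pm(\mathbf{p}^2 + m^2)^{1/2}$ and not for other values of $E$. Function names are illustrative.

```python
import math

SIGMA = (
    ((0, 1), (1, 0)),
    ((0, -1j), (1j, 0)),
    ((1, 0), (0, -1)),
)

def sigma_dot(p):
    """2x2 matrix sigma.p for a 3-vector p."""
    return [[sum(p[k] * SIGMA[k][i][j] for k in range(3)) for j in range(2)]
            for i in range(2)]

def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def dirac_hamiltonian(p, m):
    """alpha.p + beta*m in the standard representation, as a 4x4 matrix."""
    sp = sigma_dot(p)
    top = [[m, 0] + sp[0], [0, m] + sp[1]]
    bottom = [sp[0] + [-m, 0], sp[1] + [0, -m]]
    return top + bottom

def spinor(phi, p, m, E):
    """The four-component spinor of (3.47): upper phi, lower sigma.p phi/(E+m)."""
    chi = [x / (E + m) for x in matvec(sigma_dot(p), phi)]
    return list(phi) + chi

def residual(p, m, E, phi):
    """Largest component of H w - E w; zero when w is an eigenvector."""
    w = spinor(phi, p, m, E)
    Hw = matvec(dirac_hamiltonian(p, m), w)
    return max(abs(h - E * x) for h, x in zip(Hw, w))

# Arbitrary test values (assumptions of this sketch)
p, m = (0.3, -0.4, 1.2), 1.0
phi = [0.6, 0.8j]
E = math.sqrt(sum(c * c for c in p) + m * m)

print(residual(p, m, +E, phi))       # positive-energy root of (3.50)
print(residual(p, m, -E, phi))       # negative-energy root of (3.50)
print(residual(p, m, 1.1 * E, phi))  # an energy that violates (3.50)
```

The first two residuals should be at the level of rounding error, while the third is visibly non-zero, illustrating that (3.47) solves the equation only when (3.50) holds.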


3.2.2 Probability current for the Dirac equation 
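Consider the following quantity which we denote (suggestively) by $\rho$:

$$ \rho = \psi^\dagger(x)\psi(x). \qquad (3.51) $$

Here $\psi^\dagger$ is the Hermitian conjugate row vector of the column vector $\psi$. In
terms of components

$$ \rho = (\psi_1^*, \psi_2^*, \psi_3^*, \psi_4^*) \begin{pmatrix} \psi_1 \\ \psi_2 \\ \psi_3 \\ \psi_4 \end{pmatrix} \qquad (3.52) $$

so

$$ \rho = \sum_{a=1}^{4} |\psi_a|^2 > 0 \qquad (3.53) $$

and we see that $\rho$ is a scalar density which is explicitly positive-definite. This
is one property we require of a probability density: in addition, we require
a conservation law, coming from the Dirac equation, and a corresponding
probability current density. In fact (see problem 3.5) we can demonstrate,
using the Dirac equation,

$$ i\,\partial\psi/\partial t = (-i\boldsymbol{\alpha} \cdot \boldsymbol{\nabla} + \beta m)\psi \qquad (3.54) $$

and its Hermitian conjugate

$$ -i\,\partial\psi^\dagger/\partial t = \psi^\dagger(+i\boldsymbol{\alpha} \cdot \overleftarrow{\boldsymbol{\nabla}} + \beta m) \qquad (3.55) $$

that there is a conservation law of the required form

$$ \partial\rho/\partial t + \boldsymbol{\nabla} \cdot \mathbf{j} = 0. \qquad (3.56) $$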


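The notation $\psi^\dagger \overleftarrow{\boldsymbol{\nabla}}$ requires some comment: it is shorthand for three row
matrices

$$ \psi^\dagger \overleftarrow{\nabla}_x \equiv \partial\psi^\dagger/\partial x \quad \text{etc.} $$

(recall that $\psi^\dagger$ is a row matrix).
In equation (3.56), with $\rho$ being given by (3.51), the probability current
density $\mathbf{j}$ is

$$ \mathbf{j}(x) = \psi^\dagger(x)\boldsymbol{\alpha}\psi(x) \qquad (3.57) $$

representing a 3-vector with components

$$ (\psi^\dagger\alpha_x\psi,\ \psi^\dagger\alpha_y\psi,\ \psi^\dagger\alpha_z\psi). \qquad (3.58) $$

We therefore have a positive-definite $\rho$ and an associated $\mathbf{j}$ satisfying the
required conservation law (3.56), which, as usual, we can write in invariant
form as $\partial_\mu j^\mu = 0$, where

$$ j^\mu = (\rho, \mathbf{j}). \qquad (3.59) $$

Thus $j^\mu$ is an acceptable probability current, unlike the current for the KG
equation, as we might have anticipated.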
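The steps behind the conservation law (3.56) can be sketched as follows. This is one possible route to the result of problem 3.5, assuming only that the $\alpha_i$ and $\beta$ are constant Hermitian matrices: multiply (3.54) from the left by $\psi^\dagger$, multiply (3.55) from the right by $\psi$, and subtract.

```latex
\begin{align*}
\psi^\dagger \times (3.54):\quad
  & i\,\psi^\dagger \frac{\partial\psi}{\partial t}
    = -i\,\psi^\dagger \boldsymbol{\alpha}\cdot\boldsymbol{\nabla}\psi
      + m\,\psi^\dagger\beta\psi \\
(3.55) \times \psi:\quad
  & -i\,\frac{\partial\psi^\dagger}{\partial t}\,\psi
    = +i\,(\boldsymbol{\nabla}\psi^\dagger)\cdot\boldsymbol{\alpha}\,\psi
      + m\,\psi^\dagger\beta\psi \\
\text{difference:}\quad
  & i\,\frac{\partial}{\partial t}(\psi^\dagger\psi)
    = -i\,\boldsymbol{\nabla}\cdot(\psi^\dagger\boldsymbol{\alpha}\psi)
\end{align*}
```

The mass terms cancel in the difference, and dividing by $i$ gives the continuity equation with the density $\psi^\dagger\psi$ and the current $\psi^\dagger\boldsymbol{\alpha}\psi$.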

The form of equation (3.56) implies that $j^\mu$ of (3.59) is a contravariant
4-vector (cf equation (D.4)), as we verified explicitly in the KG case. The
corresponding verification is more difficult in the Dirac case, since the Dirac
spinor $\psi$ transforms non-trivially under Lorentz transformations, unlike the
KG wavefunction $\phi$. We shall come back to this problem in chapter 4.

We now turn to further discussion of the spin degree of freedom, postponing 
consideration of the negative-energy solutions until section 3.4.
3.3 Spin 


Four-momentum is not the only physical property of a particle obeying the
Dirac equation. We must now interpret the column vector (Dirac spinor)
part, $\omega$, of the solution (3.41). The particular properties of the $\sigma$-matrices,
appearing in the $\alpha$-matrices, have already led us to think in terms of spin.
A further indication that this is correct comes when we consider the explicit
form of $\omega$ given in (3.47). In this equation the two-component spinor $\phi$ is
completely arbitrary. It may be chosen in just two linearly independent ways,
for example

$$ \phi_\uparrow = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \qquad \phi_\downarrow = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \qquad (3.60) $$

which (as the notation of course indicates) are in fact eigenvectors of $S_z = \frac{1}{2}\sigma_z$
with eigenvalues $\pm\frac{1}{2}$ (‘up’ and ‘down’ along the $z$-axis). Remember that, in
quantum mechanics, linear combinations of wavefunctions can be formed using
complex numbers as superposition coefficients, in general; so the most general
$\phi$ can always be written as

$$ \phi = \begin{pmatrix} a \\ b \end{pmatrix} = a\phi_\uparrow + b\phi_\downarrow \qquad (3.61) $$

where $a$ and $b$ are complex numbers. Hence, there are precisely two linearly
independent solutions, for a given 4-momentum, just as we would expect for
a quantum system with $j = \frac{1}{2}$ (the multiplicity is $2j + 1$, in general).

In the rest frame of the particle ($\mathbf{p} = 0$) this interpretation is straightforward.
In this case choosing (3.60) for the two independent $\phi$’s, the solutions
(3.61) for $E = m$ reduce to

$$ \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix}. \qquad (3.62) $$

Since we have degeneracy between these two solutions (both have $E = m$)
there must be some operator which commutes with the energy operator, and
whose eigenvalues would distinguish the solutions (3.62). In this case the
energy operator is just $\beta m$ (from (3.54) setting $-i\boldsymbol{\nabla}$ to zero, since $\mathbf{p} = 0$) and
the required operator commuting with $\beta$ is

$$ \Sigma_z = \begin{pmatrix} \sigma_z & 0 \\ 0 & \sigma_z \end{pmatrix} \qquad (3.63) $$

which has eigenvalues 1 (twice) and $-1$ (twice). Our rest-frame spinors
appearing in (3.62) are indeed eigenstates of $\Sigma_z$, with eigenvalues $\pm 1$ as can be
easily verified.
Generalizing (3.63), we introduce the three matrices $\boldsymbol{\Sigma}$ where

$$ \boldsymbol{\Sigma} = \begin{pmatrix} \boldsymbol{\sigma} & 0 \\ 0 & \boldsymbol{\sigma} \end{pmatrix}. \qquad (3.64) $$

Then the operators $\frac{1}{2}\boldsymbol{\Sigma}$ are such that

$$ [\tfrac{1}{2}\Sigma_x, \tfrac{1}{2}\Sigma_y] = i\,\tfrac{1}{2}\Sigma_z \qquad (3.65) $$

and $(\frac{1}{2}\boldsymbol{\Sigma})^2 = \frac{3}{4}I$ where $I$ is now the unit $4 \times 4$ matrix. These are just the
properties expected of quantum-mechanical angular momentum operators (see
appendix A) belonging to magnitude $j = \frac{1}{2}$ (we already know that the
eigenvalues of $\frac{1}{2}\Sigma_z$ are $\pm\frac{1}{2}$). So we can interpret $\frac{1}{2}\boldsymbol{\Sigma}$ as spin-$\frac{1}{2}$ operators appropriate
to our rest-frame solutions; and, at least in the rest frame, we may say that
the Dirac equation describes a particle of spin-$\frac{1}{2}$.
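The two algebraic properties quoted for the spin operators, the commutator (3.65) and the value $\frac{3}{4}I$ for the sum of squares, are easy to confirm numerically. The following is a minimal sketch, assuming the block-diagonal form of (3.64) with the Pauli matrix repeated in both diagonal blocks; the helper names are illustrative.

```python
SIGMA = (
    ((0, 1), (1, 0)),
    ((0, -1j), (1j, 0)),
    ((1, 0), (0, -1)),
)

def spin_matrix(s):
    """(1/2) Sigma_i: the 2x2 matrix s in both diagonal blocks, times 1/2."""
    return [[0.5 * s[0][0], 0.5 * s[0][1], 0, 0],
            [0.5 * s[1][0], 0.5 * s[1][1], 0, 0],
            [0, 0, 0.5 * s[0][0], 0.5 * s[0][1]],
            [0, 0, 0.5 * s[1][0], 0.5 * s[1][1]]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def commutator(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    return [[x - y for x, y in zip(r1, r2)] for r1, r2 in zip(AB, BA)]

def max_diff(A, B):
    return max(abs(x - y) for r1, r2 in zip(A, B) for x, y in zip(r1, r2))

Sx, Sy, Sz = (spin_matrix(s) for s in SIGMA)

# [Sx, Sy] should equal i*Sz, as in (3.65)
i_Sz = [[1j * x for x in row] for row in Sz]
comm_error = max_diff(commutator(Sx, Sy), i_Sz)

# Sx^2 + Sy^2 + Sz^2 should equal (3/4) times the 4x4 unit matrix
squares = [matmul(S, S) for S in (Sx, Sy, Sz)]
S_squared = [[sum(sq[r][c] for sq in squares) for c in range(4)] for r in range(4)]
three_quarters = [[0.75 if r == c else 0 for c in range(4)] for r in range(4)]
casimir_error = max_diff(S_squared, three_quarters)

print(comm_error, casimir_error)
```

Both errors should vanish to rounding accuracy; since $\frac{3}{4} = j(j+1)$ with $j = \frac{1}{2}$, this is the numerical counterpart of the spin-$\frac{1}{2}$ identification made above.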

It seems reasonable to suppose that the magnitude of a spin of a particle
could not be changed by doing a Lorentz transformation, as would be required
in order to discuss the spin in a general frame with $\mathbf{p} \neq 0$. But $\frac{1}{2}\boldsymbol{\Sigma}$ is then
no longer a suitable spin operator, since it fails to commute with the energy
operator, which is now $(\boldsymbol{\alpha} \cdot \mathbf{p} + \beta m)$ from (3.54), for a plane-wave solution
with momentum $\mathbf{p}$. Yet there are still just two independent states for a given
4-momentum as our explicit solution (3.47) shows: $\phi$ can still be chosen in
only two linearly independent ways. Hence there must be some operator
which does commute with $\boldsymbol{\alpha} \cdot \mathbf{p} + \beta m$, and whose eigenvalues can be used to
distinguish the two states. Actually this condition is not enough to specify
such an operator uniquely, and several choices are common. One of the most
useful is the helicity operator $h(\mathbf{p})$ defined by

$$ h(\mathbf{p}) = \begin{pmatrix} \dfrac{\boldsymbol{\sigma} \cdot \mathbf{p}}{|\mathbf{p}|} & 0 \\ 0 & \dfrac{\boldsymbol{\sigma} \cdot \mathbf{p}}{|\mathbf{p}|} \end{pmatrix} \qquad (3.66) $$

which (see problem 3.6) does commute with $\boldsymbol{\alpha} \cdot \mathbf{p} + \beta m$. We can therefore
choose our general $\mathbf{p} \neq 0$ states to be eigenstates of $h(\mathbf{p})$. These will be
called ‘helicity states’: physically they are eigenstates of $\boldsymbol{\Sigma}$ resolved along the
direction of $\mathbf{p}$.

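Using (3.48) it is easy to see that the eigenvalues of $h(\mathbf{p})$ are $+1$ (twice)
and $-1$ (twice). Our general four-component spinor (3.47) is therefore an
eigenstate of $h(\mathbf{p})$ if

$$ \begin{pmatrix} \dfrac{\boldsymbol{\sigma} \cdot \mathbf{p}}{|\mathbf{p}|} & 0 \\ 0 & \dfrac{\boldsymbol{\sigma} \cdot \mathbf{p}}{|\mathbf{p}|} \end{pmatrix} \begin{pmatrix} \phi \\ \dfrac{\boldsymbol{\sigma} \cdot \mathbf{p}}{E + m}\phi \end{pmatrix} = \pm \begin{pmatrix} \phi \\ \dfrac{\boldsymbol{\sigma} \cdot \mathbf{p}}{E + m}\phi \end{pmatrix}. \qquad (3.67) $$

Taking the $+$ sign first, this will hold if

$$ \frac{\boldsymbol{\sigma} \cdot \mathbf{p}}{|\mathbf{p}|}\,\phi_+ = \phi_+ \qquad (3.68) $$

where the $+$ subscript has been added to indicate that this is a solution of
(3.68). Such a $\phi_+$ is called a two-component helicity spinor. The explicit form
of $\phi_+$ can be found by solving (3.68); see problem 3.7. Similarly, the
four-component spinor will be an eigenstate of $h(\mathbf{p})$ belonging to the eigenvalue
$-1$ if it contains $\phi_-$ where

$$ \frac{\boldsymbol{\sigma} \cdot \mathbf{p}}{|\mathbf{p}|}\,\phi_- = -\phi_-. \qquad (3.69) $$

Again, these two choices $\phi_+$ and $\phi_-$ are linearly independent.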
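The role of helicity can be illustrated numerically. The sketch below uses an arbitrary test momentum, the standard-representation Hamiltonian $\boldsymbol{\alpha} \cdot \mathbf{p} + \beta m$, and the projector $\frac{1}{2}(1 \pm \boldsymbol{\sigma} \cdot \hat{\mathbf{p}})$ described in problem 3.7 as one convenient (not unique) way of building $\phi_\pm$. It checks that $h(\mathbf{p})$ commutes with the Hamiltonian while $\Sigma_z$ does not, and that the projected spinors satisfy (3.68) and (3.69). All function names are illustrative.

```python
import math

SIGMA = (
    ((0, 1), (1, 0)),
    ((0, -1j), (1j, 0)),
    ((1, 0), (0, -1)),
)

def sigma_dot(p):
    return [[sum(p[k] * SIGMA[k][i][j] for k in range(3)) for j in range(2)]
            for i in range(2)]

def block_diag(A):
    """4x4 matrix with the 2x2 matrix A in both diagonal blocks."""
    return [list(A[0]) + [0, 0], list(A[1]) + [0, 0],
            [0, 0] + list(A[0]), [0, 0] + list(A[1])]

def hamiltonian(p, m):
    sp = sigma_dot(p)
    return [[m, 0] + sp[0], [0, m] + sp[1], sp[0] + [-m, 0], sp[1] + [0, -m]]

def commutator_norm(A, B):
    """Largest |entry| of AB - BA for 4x4 matrices."""
    n = range(4)
    return max(abs(sum(A[i][k] * B[k][j] - B[i][k] * A[k][j] for k in n))
               for i in n for j in n)

def unit(p):
    size = math.sqrt(sum(c * c for c in p))
    return [c / size for c in p]

def helicity_spinor(p, sign, phi=(1, 0)):
    """Normalized (1/2)(1 + sign*sigma.p_hat) phi; fails if the projection vanishes."""
    sp = sigma_dot(unit(p))
    proj = [0.5 * (phi[i] + sign * (sp[i][0] * phi[0] + sp[i][1] * phi[1]))
            for i in range(2)]
    size = math.sqrt(sum(abs(x) ** 2 for x in proj))
    return [x / size for x in proj]

def eigen_residual(p, sign):
    """Largest component of (sigma.p_hat) phi - sign*phi."""
    sp = sigma_dot(unit(p))
    phi = helicity_spinor(p, sign)
    return max(abs(sp[i][0] * phi[0] + sp[i][1] * phi[1] - sign * phi[i])
               for i in range(2))

p, m = (0.3, -0.4, 1.2), 1.0          # arbitrary test values
H = hamiltonian(p, m)
h = block_diag(sigma_dot(unit(p)))    # helicity operator (3.66)
Sigma_z = block_diag(SIGMA[2])

print(commutator_norm(h, H))          # helicity commutes with H
print(commutator_norm(Sigma_z, H))    # Sigma_z does not, unless p is along z
print(eigen_residual(p, +1), eigen_residual(p, -1))
```

The projector route breaks down only in the special case where the chosen starting spinor is orthogonal to the wanted eigenstate, in which case a different starting $\phi$ should be used.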




3.4 The negative-energy solutions 


In this section we shall first look more closely at the form of both the positive- 
and negative-energy solutions of the Dirac equation, and we shall then concen- 
trate on the physical interpretation of the negative-energy solutions of both 
the Dirac and the KG equations. 

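It will be convenient, from now on, to reserve the symbol ‘$E$’ for the
positive square root in (3.50): $E = +(\mathbf{p}^2 + m^2)^{1/2}$. The general 4-momentum in
the plane-wave solution (3.41) will be denoted by $p^\mu = (p^0, \mathbf{p})$ where $p^0$ may
be either positive or negative. With this notation equation (3.44) becomes

$$ p^0 \begin{pmatrix} \phi \\ \chi \end{pmatrix} = \begin{pmatrix} m\mathbf{1} & \boldsymbol{\sigma} \cdot \mathbf{p} \\ \boldsymbol{\sigma} \cdot \mathbf{p} & -m\mathbf{1} \end{pmatrix} \begin{pmatrix} \phi \\ \chi \end{pmatrix} \qquad (3.70) $$

in our original representation for $\boldsymbol{\alpha}$ and $\beta$.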


3.4.1 Positive-energy spinors 


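For these

$$ p^0 = +(\mathbf{p}^2 + m^2)^{1/2} = E > 0. \qquad (3.71) $$

We eliminate $\chi$ and obtain positive-energy spinors in the form

$$ \omega^{1,2} = N \begin{pmatrix} \phi^{1,2} \\ \dfrac{\boldsymbol{\sigma} \cdot \mathbf{p}}{E + m}\, \phi^{1,2} \end{pmatrix} \qquad (3.72) $$

with $\phi^{1\dagger}\phi^{1} = \phi^{2\dagger}\phi^{2} = 1$. We shall now choose $N$ so that for these
positive-energy solutions $\omega^\dagger\omega = 2E$. In this case the spinors will be denoted by $u(p, s)$,
where (problem 3.8)

$$ u(p, s) = (E + m)^{1/2} \begin{pmatrix} \phi^{s} \\ \dfrac{\boldsymbol{\sigma} \cdot \mathbf{p}}{E + m}\, \phi^{s} \end{pmatrix} \qquad s = 1, 2 \qquad (3.73) $$

and $s$ labels the spin degree of freedom in some suitable way (e.g. the
helicity eigenvalues). The complete plane-wave solution $\psi$ for such a positive
4-momentum state is then

$$ \psi = u(p, s)\, e^{-ip \cdot x} \qquad (3.74) $$

with $p^\mu = (E, \mathbf{p})$.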
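With the normalization chosen in (3.73), the density and current of section 3.2.2 take a particularly simple form for the plane wave (3.74): the phase factor drops out and one finds $\rho = u^\dagger u = 2E$ and $\mathbf{j} = u^\dagger \boldsymbol{\alpha} u = 2\mathbf{p}$. The sketch below checks this numerically for arbitrarily chosen test values, assuming the standard-representation form of $\boldsymbol{\alpha}$; the function names are illustrative.

```python
import math

SIGMA = (
    ((0, 1), (1, 0)),
    ((0, -1j), (1j, 0)),
    ((1, 0), (0, -1)),
)

def sigma_dot(p):
    return [[sum(p[k] * SIGMA[k][i][j] for k in range(3)) for j in range(2)]
            for i in range(2)]

def energy(p, m):
    return math.sqrt(sum(c * c for c in p) + m * m)

def u_spinor(p, m, phi):
    """Positive-energy spinor of (3.73) for a normalized two-spinor phi."""
    E = energy(p, m)
    sp = sigma_dot(p)
    lower = [(sp[i][0] * phi[0] + sp[i][1] * phi[1]) / (E + m) for i in range(2)]
    norm = math.sqrt(E + m)
    return [norm * x for x in list(phi) + lower]

def alpha(k):
    """alpha_k in the standard representation: sigma_k in the off-diagonal blocks."""
    s = SIGMA[k]
    return [[0, 0, s[0][0], s[0][1]], [0, 0, s[1][0], s[1][1]],
            [s[0][0], s[0][1], 0, 0], [s[1][0], s[1][1], 0, 0]]

def bilinear(u, M):
    """The number u^dagger M u."""
    return sum(u[i].conjugate() * M[i][j] * u[j]
               for i in range(4) for j in range(4))

p, m = (0.3, -0.4, 1.2), 1.0   # arbitrary test values
phi = [0.6 + 0j, 0.8j]         # normalized: |0.6|^2 + |0.8|^2 = 1
u = u_spinor(p, m, phi)

density = sum(abs(x) ** 2 for x in u)
current = [bilinear(u, alpha(k)).real for k in range(3)]

print(density, 2 * energy(p, m))
print(current, [2 * c for c in p])
```

In other words, for these solutions $j^\mu = 2p^\mu$, the same proportionality to the 4-momentum that was found for the KG plane waves.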


3.4.2 Negative-energy spinors 
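Now we look for spinors appropriate to the solution

$$ p^0 = -(\mathbf{p}^2 + m^2)^{1/2} = -E < 0 \qquad (3.75) $$

($E$ is always defined to be positive). Consider first what are appropriate
solutions at rest. We have now

$$ p^0 = -m \qquad \mathbf{p} = 0 \qquad (3.76) $$

and

$$ -m \begin{pmatrix} \phi \\ \chi \end{pmatrix} = \begin{pmatrix} m\mathbf{1} & 0 \\ 0 & -m\mathbf{1} \end{pmatrix} \begin{pmatrix} \phi \\ \chi \end{pmatrix} \qquad (3.77) $$

leading to

$$ \phi = 0. \qquad (3.78) $$

Thus the two independent negative-energy solutions at rest are just

$$ \omega(p^0 = -m, s) = \begin{pmatrix} 0 \\ \chi^{s} \end{pmatrix}. \qquad (3.79) $$

The solution for finite momentum $+\mathbf{p}$, i.e. for 4-momentum $(-E, \mathbf{p})$, is then

$$ \omega(p^0 = -E, \mathbf{p}, s) = \begin{pmatrix} \dfrac{-\boldsymbol{\sigma} \cdot \mathbf{p}}{E + m}\, \chi^{s} \\ \chi^{s} \end{pmatrix} \qquad (3.80) $$

with $\chi^{s\dagger}\chi^{s} = 1$. However, it is clearly much more in keeping with relativity
if, in addition to changing the sign of $E$, we also change the sign of $\mathbf{p}$ and
consider solutions corresponding to negative 4-momentum $(-E, -\mathbf{p}) = -p^\mu$.
We therefore define

$$ \omega(p^0 = -E, -\mathbf{p}, s) \equiv \omega^{3,4} = N \begin{pmatrix} \dfrac{\boldsymbol{\sigma} \cdot \mathbf{p}}{E + m}\, \chi^{s} \\ \chi^{s} \end{pmatrix}. \qquad (3.81) $$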


Adopting the same $N$ as in (3.73) implies the same normalization ($\omega^\dagger\omega =
2E$) for (3.81) as in (3.73); in this case the spinors are called $v(p, s)$ where
(problem 3.8)

$$ v(p, s) = (E + m)^{1/2} \begin{pmatrix} \dfrac{\boldsymbol{\sigma} \cdot \mathbf{p}}{E + m}\, \chi^{s} \\ \chi^{s} \end{pmatrix} \qquad s = 1, 2. \qquad (3.82) $$

FIGURE 3.1
Energy levels for Dirac particle: a positive-energy continuum $E > m$ and a
negative-energy continuum $E < -m$.

(There is a small subtlety in the choice of $\chi^{1}$ and $\chi^{2}$ which we will come to
shortly.) The solution $\psi$ for such negative 4-momentum states is then

$$ \psi = v(p, s)\, e^{-i(-p) \cdot x} = v(p, s)\, e^{ip \cdot x}. \qquad (3.83) $$
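A plane wave with space-time dependence $e^{ip \cdot x}$ has energy eigenvalue $-E$ and momentum eigenvalue $-\mathbf{p}$, so the spinor $v(p, s)$ of (3.82) should satisfy $(-\boldsymbol{\alpha} \cdot \mathbf{p} + \beta m)\, v = -E\, v$, with $v^\dagger v = 2E$. The sketch below verifies both statements for arbitrary test values in the standard representation; the function names are illustrative.

```python
import math

SIGMA = (
    ((0, 1), (1, 0)),
    ((0, -1j), (1j, 0)),
    ((1, 0), (0, -1)),
)

def sigma_dot(p):
    return [[sum(p[k] * SIGMA[k][i][j] for k in range(3)) for j in range(2)]
            for i in range(2)]

def energy(p, m):
    return math.sqrt(sum(c * c for c in p) + m * m)

def hamiltonian(p, m):
    """alpha.p + beta*m in the standard representation."""
    sp = sigma_dot(p)
    return [[m, 0] + sp[0], [0, m] + sp[1], sp[0] + [-m, 0], sp[1] + [0, -m]]

def v_spinor(p, m, chi):
    """Negative 4-momentum spinor of (3.82) for a normalized two-spinor chi."""
    E = energy(p, m)
    sp = sigma_dot(p)
    upper = [(sp[i][0] * chi[0] + sp[i][1] * chi[1]) / (E + m) for i in range(2)]
    norm = math.sqrt(E + m)
    return [norm * x for x in upper + list(chi)]

p, m = (0.3, -0.4, 1.2), 1.0   # arbitrary test values
chi = [0.6 + 0j, 0.8j]         # normalized two-spinor
E = energy(p, m)
v = v_spinor(p, m, chi)

# Momentum eigenvalue is -p, so the relevant Hamiltonian is alpha.(-p) + beta*m
H = hamiltonian([-c for c in p], m)
Hv = [sum(H[i][j] * v[j] for j in range(4)) for i in range(4)]

residual = max(abs(Hv[i] + E * v[i]) for i in range(4))   # tests H v = -E v
norm_sq = sum(abs(x) ** 2 for x in v)                     # tests v^dagger v = 2E

print(residual, norm_sq, 2 * E)
```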


3.4.3 Dirac’s interpretation of the negative-energy solutions 
of the Dirac equation 


The physical interpretation of the positive-energy solutions (3.74) is
straightforward, in terms of the $\rho$ and $\mathbf{j}$ given in section 3.2.2. They describe spin-$\frac{1}{2}$
particles with 4-momentum $(E, \mathbf{p})$ and spin appropriate to the choice of $\phi^{s}$; $\rho$
and the energy $p^0$ are both positive.

Unfortunately $\rho$ is also positive for the negative-energy solutions (3.83),
so we cannot eliminate them on that account. This means that for a free
Dirac particle (e.g. an electron) the available positive- and negative-energy
levels are as shown in figure 3.1. This, in turn, implies that a particle with
initially positive energy can ‘cascade down’ through the negative-energy levels,
without limit; in this case no stable positive-energy state would exist!

In order to prevent positive-energy electrons making transitions to the 
lower, negative-energy states, Dirac postulated that the normal ‘empty’, or 
‘vacuum’, state — that with no positive-energy electrons present — is such that 
all the negative-energy states are filled with electrons. The Pauli exclusion 
principle then forbids any positive-energy electrons from falling into these 
lower energy levels. The ‘vacuum’ now has infinite negative charge and energy, 
but since all observations represent finite fluctuations in energy and charge 
with respect to this vacuum, this leads to an acceptable theory. For example, 
if one negative-energy electron is absent from the Dirac sea, we have a ‘hole’
relative to the normal vacuum:

$$ \text{energy of ‘hole’} = -(E_{\rm neg}) \to \text{positive energy} $$
$$ \text{charge of ‘hole’} = -(q_{\rm e}) \to \text{positive charge}. $$

Thus the absence of a negative-energy electron is equivalent to the presence of
a positive-energy positively charged version of the electron, that is a positron.
In the same way, the absence of a ‘spin-up’ negative-energy electron is
equivalent to the presence of a ‘spin-down’ positive-energy positron. This last point
is the reason for the subtlety in the choice of $\chi^{s}$ mentioned after (3.82): we
choose

$$ \chi^{1} = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \qquad \chi^{2} = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \qquad (3.84) $$

the opposite way round from the choice for the positive-energy spinors (3.73).

Dirac’s brilliant re-interpretation of (unfilled) negative-energy solutions in 
terms of antiparticles is one of the triumphs of theoretical physics²: Carl
Anderson received the Nobel Prize for his discovery of the positron in 1932 
(Anderson 1932). 

In this way it proved possible to obtain sensible results from the Dirac 
equation and its negative-energy solutions. It is clear, however, that the theory 
is no longer really a ‘single-particle’ theory, since we can excite electrons from 
the infinite ‘sea’ of filled negative-energy states that constitute the normal 
‘empty state’. For example, if we excite one negative-energy electron to a 
positive-energy state, we have in the final state a positive-energy electron plus 
a positive-energy positron ‘hole’ in the vacuum: this corresponds physically to 
the process of $e^+e^-$ pair creation. Thus this way of dealing with the
negative-energy problem for fermions leads us directly to the need for a quantum field
theory. The appropriate formalism will be presented later, in section 7.2. 


3.4.4 Feynman’s interpretation of the negative-energy 
solutions of the KG and Dirac equations 


It is clear that despite its brilliant success for spin-$\frac{1}{2}$ particles, Dirac’s
interpretation cannot be applied to spin-0 particles, since bosons are not subject to
the exclusion principle. Besides, spin-0 particles also have their corresponding
antiparticles (e.g. $\pi^+$ and $\pi^-$), and so do spin-1 particles ($W^+$ and $W^-$, for
instance). A consistent picture for both bosons and fermions does emerge
from quantum field theory, as we shall see in chapters 5-7, which is perhaps
one of the strongest reasons for mastering it. Nevertheless, it is useful to have
an alternative, non-field-theoretic, interpretation of the negative-energy
solutions which works for both bosons and fermions. Such an interpretation is due
to Feynman: in essence, the idea is that the negative 4-momentum solutions
will be used to describe antiparticles, for both bosons and fermions.

² At that time, this was not universally recognized. For example, Pauli (1933) wrote:
‘Dirac has tried to identify holes with antielectrons ... we do not believe that this explanation
can be seriously considered.’

We begin with bosons, for example pions, which for the present purposes
we take to be simple spin-0 particles whose wavefunctions obey the KG
equation. We decide by convention that the $\pi^+$ is the ‘particle’. We will then
have

positive 4-momentum $\pi^+$ solutions: $N e^{-ip \cdot x}$ (3.85)

negative 4-momentum $\pi^+$ solutions: $N e^{ip \cdot x}$ (3.86)

where $p^\mu = [(m^2 + \mathbf{p}^2)^{1/2}, \mathbf{p}]$. The electromagnetic current for a free physical
(positive-energy) $\pi^+$ is given by the probability current for a positive-energy
solution multiplied by the charge $Q\,(= +e)$:

$$ j^\mu_{\rm em}(\pi^+) = (+e) \times (\text{probability current for positive energy } \pi^+) \qquad (3.87) $$
$$ \phantom{j^\mu_{\rm em}(\pi^+)} = (+e)\,2|N|^2[(m^2 + \mathbf{p}^2)^{1/2}, \mathbf{p}] \qquad (3.88) $$

using (3.20) and (3.85) (see problem 3.1). What about the current for the $\pi^-$?
For free physical $\pi^-$ particles of positive energy $(m^2 + \mathbf{p}^2)^{1/2}$ and momentum
$\mathbf{p}$ we expect

$$ j^\mu_{\rm em}(\pi^-) = (-e)\,2|N|^2[(m^2 + \mathbf{p}^2)^{1/2}, \mathbf{p}] \qquad (3.89) $$

by simply changing the sign of the charge in (3.88). But it is evident that
(3.89) may be written as

$$ j^\mu_{\rm em}(\pi^-) = (+e)\,2|N|^2[-(m^2 + \mathbf{p}^2)^{1/2}, -\mathbf{p}] \qquad (3.90) $$

which is just $j^\mu_{\rm em}(\pi^+)$ with negative 4-momentum. This suggests some
equivalence between antiparticle solutions with positive 4-momentum and particle
solutions with negative 4-momentum.

Can we push this equivalence further? Consider what happens when a
system $A$ absorbs a $\pi^+$ with positive 4-momentum $p$: its charge increases by
$+e$, and its 4-momentum increases by $p$. Now suppose that $A$ emits a physical
$\pi^-$ with 4-momentum $k$, where the energy $k^0$ is positive. Then the charge
of $A$ will increase by $+e$, and its 4-momentum will decrease by $k$. Now this
increase in the charge of $A$ could equally well be caused by the absorption
of a $\pi^+$, and indeed we can make the effect (as far as $A$ is concerned) of
the $\pi^-$ emission process fully equivalent to a $\pi^+$ absorption process if we say
that the equivalent absorbed $\pi^+$ has negative 4-momentum, $-k$; in particular
the equivalent absorbed $\pi^+$ has negative energy $-k^0$. In this way, we view
the emission of a physical ‘antiparticle’ $\pi^-$ with positive 4-momentum $k$ as
equivalent to the absorption of a ‘particle’ $\pi^+$ with (unphysical) negative
4-momentum $-k$. Similar reasoning will apply to the absorption of a $\pi^-$ of
positive 4-momentum, which is equivalent to the emission of a $\pi^+$ of negative
4-momentum. Thus we are led to the following hypothesis (due to Feynman):

The emission (absorption) of an antiparticle of 4-momentum $p^\mu$ is
physically equivalent to the absorption (emission) of a particle of 4-momentum
$-p^\mu$.

FIGURE 3.2
Coulomb scattering of a $\pi^-$ by a static charge $Ze$ illustrating the Feynman
interpretation of negative 4-momentum states (panels (a) and (b)).


In other words the unphysical negative 4-momentum solutions of the ‘particle’ 
equation do have a role to play: they can be used to describe physical processes 
involving positive 4-momentum antiparticles, if we reverse the role of ‘entry’ 
and ‘exit’ states. 

The idea is illustrated in figure 3.2, for the case of Coulomb scattering of a
$\pi^-$ particle by a static charge $Ze$, which will be discussed later in section 8.1.3.
By convention we are taking $\pi^-$ to be the antiparticle. In the physical process
of figure 3.2(a) the incoming physical antiparticle $\pi^-$ has 4-momentum $p_{\rm i}$,
and the final $\pi^-$ has 4-momentum $p_{\rm f}$: both $E_{\rm i}$ and $E_{\rm f}$ are, of course, positive.
Figure 3.2(b) shows how the amplitude for the process can be calculated using
$\pi^+$ solutions with negative 4-momentum. The initial state $\pi^-$ of 4-momentum
$p_{\rm i}$ becomes a final state $\pi^+$ with 4-momentum $-p_{\rm i}$, and similarly the final state
$\pi^-$ of 4-momentum $p_{\rm f}$ becomes an initial state $\pi^+$ of 4-momentum $-p_{\rm f}$. Note
that in this and similar figures, the sense of the arrows always indicates the
‘flow’ of 4-momentum, positive 4-momentum corresponding to forward flow.

It is clear that the basic physical idea here is not limited to bosons. But
there is a difference between the KG and Dirac cases in that the Dirac equation
was explicitly designed to yield a probability density (and probability current
density) which was independent of the sign of the energy:

$$ \rho = \psi^\dagger\psi \qquad \mathbf{j} = \psi^\dagger\boldsymbol{\alpha}\psi. \qquad (3.91) $$

Thus for any solutions of the form

$$ \psi = \omega\,\phi(\mathbf{x}, t) \qquad (3.92) $$

we have

$$ \rho = \omega^\dagger\omega\,|\phi(\mathbf{x}, t)|^2 \qquad (3.93) $$

and

$$ \mathbf{j} = \omega^\dagger\boldsymbol{\alpha}\omega\,|\phi(\mathbf{x}, t)|^2 \qquad (3.94) $$

and $\rho \geq 0$ always. We nevertheless want to set up a correspondence so that
positive-energy solutions describe electrons (taken to be the ‘particle’, by
convention, in this case) and negative-energy solutions describe positrons, if we
reverse the sense of incoming and outgoing waves. For the KG case this
was straightforward, since the probability current was proportional to the
4-momentum:

$$ j^\mu({\rm KG}) \sim p^\mu. \qquad (3.95) $$

We were therefore able to set up the correspondence for the electromagnetic
current of $\pi^+$ and $\pi^-$:

$$ \pi^+:\quad j^\mu_{\rm em} \sim e\,p^\mu \qquad \text{positive energy } \pi^+ \qquad (3.96) $$
$$ \pi^-:\quad j^\mu_{\rm em} \sim (-e)\,p^\mu \qquad \text{positive energy } \pi^- \qquad (3.97) $$
$$ \phantom{\pi^-:\quad j^\mu_{\rm em}} = (+e)(-p^\mu) \qquad \text{negative energy } \pi^+. \qquad (3.98) $$


This simple connection does not hold for the Dirac case since $\rho \geq 0$ for
both signs of the energy. It is still possible to set up the correspondence,
but now an extra minus sign must be inserted ‘by hand’ whenever we have a
negative-energy fermion in the final state. We shall make use of this rule in
section 8.2.4. We therefore state the Feynman hypothesis for fermions:

The invariant amplitude for the emission (absorption) of an antifermion
of 4-momentum $p^\mu$ and spin projection $s_z$ in the rest frame is equal to
the amplitude (minus the amplitude) for the absorption (emission) of a
fermion of 4-momentum $-p^\mu$ and spin projection $-s_z$ in the rest frame.


As we shall see in chapters 5-7, the Feynman interpretation of the negative- 
energy solutions is naturally embodied in the field theory formalism. 




3.5 Inclusion of electromagnetic interactions via the 
gauge principle: the Dirac prediction of g = 2 
for the electron 
Having set up the relativistic spin-0 and spin-$\frac{1}{2}$ free-particle wave equations,
we are now in a position to use the machinery developed in chapter 2, in
order to include electromagnetic interactions. All we have to do is make the
replacement

$$ \partial^\mu \to D^\mu \equiv \partial^\mu + iqA^\mu \qquad (3.99) $$

for a particle of charge $q$. For the spin-0 KG equation (3.10) we obtain, after
some rearrangement (problem 3.9),

$$ (\Box + m^2)\phi = -iq(\partial_\mu A^\mu + A^\mu\partial_\mu)\phi + q^2A^2\phi \qquad (3.100) $$
$$ \phantom{(\Box + m^2)\phi} \equiv -V_{\rm KG}\,\phi. \qquad (3.101) $$

Note that the potential $V_{\rm KG}$ contains the differential operator $\partial_\mu$; the sign of
$V_{\rm KG}$ is a convention chosen so as to maintain the same relative sign between
$\boldsymbol{\nabla}^2$ and $V$ as in the Schrödinger equation, for example that in (A.5).

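For the Dirac equation the replacement (3.99) leads to

$$ i\frac{\partial\psi}{\partial t} = [\boldsymbol{\alpha} \cdot (-i\boldsymbol{\nabla} - q\mathbf{A}) + \beta m + qA^0]\psi \qquad (3.102) $$

where $A^\mu = (A^0, \mathbf{A})$. The potential due to $A^\mu$ is therefore $V_{\rm D} = qA^0\mathbf{1} - q\boldsymbol{\alpha} \cdot \mathbf{A}$,
which is a $4 \times 4$ matrix acting on the Dirac spinor.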

The non-relativistic limit of (3.102) is of great importance, both physically 
and historically. It was, of course, first obtained by Dirac; and it provided, 
in 1928, a sensational explanation of why the g-factor of the electron had the 
value g = 2, which was then the empirical value, without any theoretical basis. 

By way of background, recall from appendix A that the Schrödinger
equation for a non-relativistic spinless particle of charge $q$ in a magnetic field $\mathbf{B}$
described by a vector potential $\mathbf{A}$ such that $\mathbf{B} = \boldsymbol{\nabla} \times \mathbf{A}$ is

$$ -\frac{1}{2m}\boldsymbol{\nabla}^2\psi - \frac{q}{2m}\mathbf{B} \cdot \hat{\mathbf{L}}\,\psi + \frac{q^2}{2m}\mathbf{A}^2\psi = i\frac{\partial\psi}{\partial t}. \qquad (3.103) $$

Taking $\mathbf{B}$ along the $z$-axis, the $\mathbf{B} \cdot \hat{\mathbf{L}}$ term will cause the usual splitting (into
states of different magnetic quantum number) of the $(2l + 1)$-fold degeneracy
associated with a state of definite $l$. In particular, though, there should be no
splitting of the hydrogen ground state which has $l = 0$. But experimentally
splitting into two levels is observed, indicating a two-fold degeneracy and thus
(see earlier) a $j = \frac{1}{2}$-like degree of freedom.

Uhlenbeck and Goudsmit (1925) suggested that the doubling of the
hydrogen ground state could be explained if the electron were given an
additional quantum number corresponding to an angular-momentum-like
observable, having magnitude $j = \frac{1}{2}$. The operators $\mathbf{S} = \frac{1}{2}\boldsymbol{\sigma}$ which we have already
met serve to represent such a spin angular momentum. If the contribution to
the energy operator of the particle due to its spin $\mathbf{S}$ enters into the effective
Schrödinger equation in exactly the same way as that due to its orbital
angular momentum, then we would expect an additional term on the left-hand
side of (3.103) of the form

$$ -\frac{q}{2m}\mathbf{B} \cdot \mathbf{S}. \qquad (3.104) $$

The corresponding wavefunction must now have two (spinor) components,
acted on by the $2 \times 2$ matrices in $\mathbf{S}$.

The energy difference between the two levels with eigenvalues $S_z = \pm\frac{1}{2}$
would then be $qB/2m$ in magnitude. Experimentally the splitting was found
to be just twice this value. Thus empirically the term (3.104) was modified to

$$ -g\frac{q}{2m}\mathbf{B} \cdot \mathbf{S} \qquad (3.105) $$

where $g$ is the ‘gyromagnetic ratio’ of the particle, with $g \approx 2$. Let us now see
how Dirac deduced the term (3.105), with the precise value $g = 2$, from his
equation.
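The factor of two in the splitting can be made concrete with a few lines of arithmetic. The sketch below evaluates the eigenvalues of the term (3.105) for $\mathbf{B}$ along the $z$-axis, where $S_z = \pm\frac{1}{2}$, and compares the level spacing for $g = 1$ (the form (3.104)) and $g = 2$. The numerical values of $q$, $B$ and $m$ are arbitrary illustrations in natural units, not physical data.

```python
def spin_energy_levels(g, q, B, m):
    """Eigenvalues of -g (q/2m) B S_z for S_z = +1/2 and -1/2 (B along z)."""
    return [-g * q * B * s_z / (2 * m) for s_z in (0.5, -0.5)]

def splitting(g, q, B, m):
    """Energy difference between the two spin levels."""
    low, high = sorted(spin_energy_levels(g, q, B, m))
    return high - low

# Illustrative values only (natural units, charge in units of e)
q, B, m = -1.0, 0.2, 1.0

print(splitting(1, q, B, m))   # |q| B / 2m, the naive expectation from (3.104)
print(splitting(2, q, B, m))   # twice that, as found experimentally
```

The spacing is $g|q|B/2m$ in general, so doubling the observed splitting relative to (3.104) is exactly the statement $g \approx 2$.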

To achieve a non-relativistic limit, we expect that we have somehow to
reduce the four-component Dirac equation to one involving just two
components, since the desired term (3.105) is only a $2 \times 2$ matrix. Looking at the
explicit form (3.72) for the free-particle positive-energy solutions, we see that
the lower two components are of order $v$ (i.e. $v/c$ with $c = 1$) times the upper
two. This suggests that, to get a non-relativistic limit, we should regard the
lower two components of $\psi$ as being small (at least in the specific
representation we are using for $\boldsymbol{\alpha}$ and $\beta$). However, since (3.102) includes the $A^\mu$-field,
this will have to be demonstrated (see (3.112)). Also, if we write the total
energy operator as $m + H_1$, we expect $H_1$ to be the non-relativistic energy
operator.

We let

$$ \psi = \begin{pmatrix} \Psi \\ \Phi \end{pmatrix} \qquad (3.106) $$

where $\Psi$ and $\Phi$ are not free-particle solutions, and they carry the space-time
dependence as well as the spinor character (each has two components). We
set

$$ H_1 = \boldsymbol{\alpha} \cdot (-i\boldsymbol{\nabla} - q\mathbf{A}) + \beta m + qA^0 - m \qquad (3.107) $$


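where a $4 \times 4$ unit matrix multiplying the last two terms is understood. Then

$$ H_1 \begin{pmatrix} \Psi \\ \Phi \end{pmatrix} = \begin{pmatrix} 0 & \boldsymbol{\sigma} \cdot (-i\boldsymbol{\nabla} - q\mathbf{A}) \\ \boldsymbol{\sigma} \cdot (-i\boldsymbol{\nabla} - q\mathbf{A}) & 0 \end{pmatrix} \begin{pmatrix} \Psi \\ \Phi \end{pmatrix} - 2m \begin{pmatrix} 0 \\ \Phi \end{pmatrix} + qA^0 \begin{pmatrix} \Psi \\ \Phi \end{pmatrix}. \qquad (3.108) $$

Multiplying out (3.108), we obtain

$$ H_1\Psi = \boldsymbol{\sigma} \cdot (-i\boldsymbol{\nabla} - q\mathbf{A})\Phi + qA^0\Psi \qquad (3.109) $$
$$ H_1\Phi = \boldsymbol{\sigma} \cdot (-i\boldsymbol{\nabla} - q\mathbf{A})\Psi + qA^0\Phi - 2m\Phi. \qquad (3.110) $$

From (3.110), we obtain

$$ (H_1 - qA^0 + 2m)\Phi = \boldsymbol{\sigma} \cdot (-i\boldsymbol{\nabla} - q\mathbf{A})\Psi. \qquad (3.111) $$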


So, if $H_1$ (or rather any matrix element of it) is $\ll m$ and if $A^0$ is positive or,
if negative, much less in magnitude than $m/e$, we can deduce

$$ \Phi \sim (\text{velocity}) \times \Psi \qquad (3.112) $$

as in the free case, provided that the magnetic energy $\sim \boldsymbol{\sigma} \cdot \mathbf{A}$ is not of order
$m$. Further, if $H_1 \ll m$ and the conditions on the fields are met, we can drop
$H_1$ and $qA^0$ on the left-hand side of (3.111), as a first approximation, so that

$$ \Phi \approx \frac{\boldsymbol{\sigma} \cdot (-i\boldsymbol{\nabla} - q\mathbf{A})}{2m}\,\Psi. \qquad (3.113) $$

Hence, in (3.109),

$$ H_1\Psi \approx \frac{1}{2m}\{\boldsymbol{\sigma} \cdot (-i\boldsymbol{\nabla} - q\mathbf{A})\}^2\Psi + qA^0\Psi. \qquad (3.114) $$

The right-hand side of (3.114) should therefore be the non-relativistic energy
operator for a spin-$\frac{1}{2}$ particle of charge $q$ and mass $m$ in a field $A^\mu$.
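The quality of the approximation (3.113) can be gauged in the free case, where the exact relation between lower and upper components is known from (3.47): the exact factor is $|\mathbf{p}|/(E + m)$, while (3.113) with $\mathbf{A} = 0$ gives $|\mathbf{p}|/2m$. The sketch below compares the two for a few illustrative momenta (in units where $m = 1$); the function names are not from the text.

```python
import math

def exact_ratio(p, m):
    """|lower|/|upper| for the free positive-energy spinor (3.47)."""
    E = math.sqrt(p * p + m * m)
    return p / (E + m)

def approx_ratio(p, m):
    """The same ratio from the approximation (3.113) with A = 0."""
    return p / (2 * m)

def relative_error(p, m):
    return (approx_ratio(p, m) - exact_ratio(p, m)) / exact_ratio(p, m)

m = 1.0
for p in (0.5, 0.1, 0.01):
    v = p / math.sqrt(p * p + m * m)   # velocity in units of c
    print(v, exact_ratio(p, m), approx_ratio(p, m), relative_error(p, m))
```

The relative error equals $(E - m)/2m$, which is of order $v^2/4$ for small velocities, so the lower components are indeed about $v/2$ times the upper ones and the approximation improves rapidly as $v \to 0$.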

Consider then the case $A^0 = 0$ which is sufficient for the discussion of $g$.
We need to evaluate

$$ \{\boldsymbol{\sigma} \cdot (-i\boldsymbol{\nabla} - q\mathbf{A})\}^2\Psi. \qquad (3.115) $$

This requires care, because although it is true that (for example) $(\boldsymbol{\sigma} \cdot \mathbf{p})^2 = \mathbf{p}^2$
if $\mathbf{p} = (p_x, p_y, p_z)$ are ordinary numbers which commute with each other,
the components of ‘$-i\boldsymbol{\nabla} - q\mathbf{A}$’ do not commute due to the presence of the
differential operator $\boldsymbol{\nabla}$, and the fact that $\mathbf{A}$ depends on $\mathbf{r}$. In problem 3.10
it is shown that

$$ \{\boldsymbol{\sigma} \cdot (-i\boldsymbol{\nabla} - q\mathbf{A})\}^2\Psi = (-i\boldsymbol{\nabla} - q\mathbf{A})^2\Psi - q\boldsymbol{\sigma} \cdot \mathbf{B}\,\Psi. \qquad (3.116) $$

The first term on the right-hand side of (3.116) when inserted into (3.114),
gives precisely the spin-0 non-relativistic Hamiltonian appearing on the
left-hand side of (3.103) (see appendix A), while the second term in (3.116) yields
exactly (3.105) with $g = 2$, recalling that $\mathbf{S} = \frac{1}{2}\boldsymbol{\sigma}$. Thus the non-relativistic
reduction of the Dirac equation leads to the prediction $g = 2$ for a spin-$\frac{1}{2}$
particle.
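A compact way to see where the $\boldsymbol{\sigma} \cdot \mathbf{B}$ term in (3.116) comes from is sketched below, as an alternative to the step-by-step route of problem 3.10. The shorthand $\boldsymbol{\pi} \equiv -i\boldsymbol{\nabla} - q\mathbf{A}$ is introduced only for this sketch, $\partial_i$ denotes the Cartesian components of $\boldsymbol{\nabla}$, and the product rule for Pauli matrices is the one quoted in problem 3.4.

```latex
\begin{align*}
(\boldsymbol{\sigma}\cdot\boldsymbol{\pi})^2
  &= \sigma_i\sigma_j\,\pi_i\pi_j
   = (\delta_{ij} + i\epsilon_{ijk}\sigma_k)\,\pi_i\pi_j
   = \boldsymbol{\pi}^2 + \tfrac{i}{2}\,\epsilon_{ijk}\sigma_k\,[\pi_i,\pi_j] \\
[\pi_i,\pi_j]\,\Psi
  &= iq\,(\partial_i A_j - \partial_j A_i)\,\Psi
   = iq\,\epsilon_{ijl}B_l\,\Psi \\
(\boldsymbol{\sigma}\cdot\boldsymbol{\pi})^2\,\Psi
  &= \boldsymbol{\pi}^2\Psi
     + \tfrac{i}{2}(iq)\,\epsilon_{ijk}\epsilon_{ijl}\,\sigma_k B_l\,\Psi
   = \boldsymbol{\pi}^2\Psi - q\,\boldsymbol{\sigma}\cdot\mathbf{B}\,\Psi
\end{align*}
```

The last step uses $\epsilon_{ijk}\epsilon_{ijl} = 2\delta_{kl}$. The spin term therefore arises entirely from the non-commutativity of the components of $\boldsymbol{\pi}$; for commuting numbers the commutator vanishes and one recovers $(\boldsymbol{\sigma} \cdot \mathbf{p})^2 = \mathbf{p}^2$.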

In actual fact, the measured $g$-factor of the electron (and muon) is slightly
greater than this value: $g_{\rm exp} = 2(1 + a)$. The ‘anomaly’ $a$, which is of order
$10^{-3}$ in size, is measured with quite extraordinary precision (see section 11.7)
for both the $e^-$ and $e^+$. This small correction can also be computed with
equally extraordinary accuracy, using the full theory of QED, as we shall
briefly explain in chapter 11. The agreement between theory and experiment is
phenomenal and is one example of such agreement exhibited by our ‘paradigm
theory’.

It may be worth noting that spin-$\frac{1}{2}$ hadrons, such as the proton, have
$g$-factors very different from the Dirac prediction. This is because they are, as
we know, composite objects and are thus (in this respect) more like atoms or
nuclei than ‘elementary particles’.


Problems 
3.1

(a) In natural units $\hbar = c = 1$ and with $2m = 1$, the Schrödinger
equation may be written as

$$ -\boldsymbol{\nabla}^2\psi + V\psi - i\,\partial\psi/\partial t = 0. $$

Multiply this equation from the left by $\psi^*$ and multiply the complex
conjugate of this equation by $\psi$ (assume $V$ is real). Subtract the
two equations and show that your answer may be written in the
form of a continuity equation

$$ \partial\rho/\partial t + \boldsymbol{\nabla} \cdot \mathbf{j} = 0 $$

where $\rho = \psi^*\psi$ and $\mathbf{j} = i^{-1}[\psi^*(\boldsymbol{\nabla}\psi) - (\boldsymbol{\nabla}\psi^*)\psi]$.

(b) Perform the same operations for the Klein-Gordon equation and
derive the corresponding ‘probability’ density current. Show also
that for a free-particle solution

$$ \phi = N e^{-ip \cdot x} $$

with $p^\mu = (E, \mathbf{p})$, the probability current $j^\mu = (\rho, \mathbf{j})$ is proportional
to $p^\mu$.
3.2

(a) Prove the following properties of the matrices $\alpha_i$ and $\beta$:

(i) $\alpha_i$ and $\beta$ ($i = 1, 2, 3$) are all Hermitian [Hint: what is the
Hamiltonian?].

(ii) ${\rm Tr}\,\alpha_i = {\rm Tr}\,\beta = 0$ where ‘Tr’ means the trace, i.e. the sum of
the diagonal elements [Hint: use ${\rm Tr}(AB) = {\rm Tr}(BA)$ for any
matrices $A$ and $B$, and prove this too!].

(iii) The eigenvalues of $\alpha_i$ and $\beta$ are $\pm 1$ [Hint: square $\alpha_i$ and $\beta$].

(iv) The dimensionality of $\alpha_i$ and $\beta$ is even [Hint: the trace of a
matrix is equal to the sum of its eigenvalues].

(b) Verify explicitly that the matrices $\boldsymbol{\alpha}$ and $\beta$ of (3.31), and of (3.40),
satisfy the Dirac conditions (3.34)-(3.36).


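3.3 For free-particle solutions of the Dirac equation

$$ \psi = \omega\, e^{-ip \cdot x} $$

the four-component spinor $\omega$ may be written in terms of the two-component
spinors

$$ \omega = \begin{pmatrix} \phi \\ \chi \end{pmatrix}. $$

From the Dirac equation for $\psi$

$$ i\,\partial\psi/\partial t = (-i\boldsymbol{\alpha} \cdot \boldsymbol{\nabla} + \beta m)\psi $$

using the explicit forms for the Dirac matrices

$$ \boldsymbol{\alpha} = \begin{pmatrix} 0 & \boldsymbol{\sigma} \\ \boldsymbol{\sigma} & 0 \end{pmatrix} \qquad \beta = \begin{pmatrix} \mathbf{1} & 0 \\ 0 & -\mathbf{1} \end{pmatrix} $$

show that $\phi$ and $\chi$ satisfy the coupled equations

$$ (E - m)\phi = \boldsymbol{\sigma} \cdot \mathbf{p}\, \chi $$
$$ (E + m)\chi = \boldsymbol{\sigma} \cdot \mathbf{p}\, \phi $$

where $p^\mu = (E, \mathbf{p})$.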
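3.4

(a) Using the explicit forms for the $2 \times 2$ Pauli matrices, verify the
commutation (square brackets) and anticommutation (braces)
relations [note the summation convention for repeated indices:
$\epsilon_{ijk}\sigma_k \equiv \sum_{k=1}^{3} \epsilon_{ijk}\sigma_k$]:

$$ [\sigma_i, \sigma_j] = 2i\epsilon_{ijk}\sigma_k \qquad \{\sigma_i, \sigma_j\} = 2\delta_{ij}\mathbf{1} $$

where $\epsilon_{ijk}$ is the usual antisymmetric tensor

$$ \epsilon_{ijk} = \begin{cases} +1 & \text{for an even permutation of 1, 2, 3} \\ -1 & \text{for an odd permutation of 1, 2, 3} \\ 0 & \text{if two or more indices are the same,} \end{cases} $$

$\delta_{ij}$ is the usual Kronecker delta, and $\mathbf{1}$ is the $2 \times 2$ unit matrix. Hence
show that

$$ \sigma_i\sigma_j = \delta_{ij}\mathbf{1} + i\epsilon_{ijk}\sigma_k. $$

(b) Use this last identity to prove the result

$$ (\boldsymbol{\sigma} \cdot \mathbf{a})(\boldsymbol{\sigma} \cdot \mathbf{b}) = \mathbf{a} \cdot \mathbf{b}\,\mathbf{1} + i\boldsymbol{\sigma} \cdot \mathbf{a} \times \mathbf{b}. $$

Using the explicit $2 \times 2$ form for

$$ \boldsymbol{\sigma} \cdot \mathbf{p} = \begin{pmatrix} p_z & p_x - ip_y \\ p_x + ip_y & -p_z \end{pmatrix} $$

show that

$$ (\boldsymbol{\sigma} \cdot \mathbf{p})^2 = \mathbf{p}^2\mathbf{1}. $$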

3.5 Verify the conservation equation (3.56). 


3.6 Check that $h(\mathbf{p})$ as given by (3.66) does commute with $\boldsymbol{\alpha} \cdot \mathbf{p} + \beta m$, the
momentum-space free Dirac Hamiltonian.


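3.7 Let $\phi$ be an arbitrary two-component spinor, and let $\hat{\mathbf{u}}$ be a unit vector.

(a) Show that $\frac{1}{2}(1 + \boldsymbol{\sigma} \cdot \hat{\mathbf{u}})\phi$ is an eigenstate of $\boldsymbol{\sigma} \cdot \hat{\mathbf{u}}$ with eigenvalue
$+1$. The operator $\frac{1}{2}(1 + \boldsymbol{\sigma} \cdot \hat{\mathbf{u}})$ is called a projector operator for
the $\boldsymbol{\sigma} \cdot \hat{\mathbf{u}} = +1$ eigenstate since when acting on any $\phi$ this is what
it ‘projects out’. Write down a similar operator which projects out
the $\boldsymbol{\sigma} \cdot \hat{\mathbf{u}} = -1$ eigenstate.

(b) Construct two two-component spinors $\phi_+$ and $\phi_-$ which are
eigenstates of $\boldsymbol{\sigma} \cdot \hat{\mathbf{u}}$ belonging to eigenvalues $\pm 1$, and normalized to
$\phi_r^\dagger\phi_s = \delta_{rs}$ for $(r, s) = (+, -)$, for the case $\hat{\mathbf{u}} = (\sin\theta\cos\phi, \sin\theta\sin\phi, \cos\theta)$
[Hint: take the arbitrary $\phi = \binom{1}{0}$].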


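3.8 Positive-energy spinors $u(p, s)$ are defined by

$$ u(p, s) = (E + m)^{1/2} \begin{pmatrix} \phi^{s} \\ \dfrac{\boldsymbol{\sigma} \cdot \mathbf{p}}{E + m}\, \phi^{s} \end{pmatrix} \qquad s = 1, 2 $$

with $\phi^{s\dagger}\phi^{s} = 1$. Verify that these satisfy $u^\dagger u = 2E$.
In a similar way, negative-energy spinors $v(p, s)$ are defined by

$$ v(p, s) = (E + m)^{1/2} \begin{pmatrix} \dfrac{\boldsymbol{\sigma} \cdot \mathbf{p}}{E + m}\, \chi^{s} \\ \chi^{s} \end{pmatrix} \qquad s = 1, 2 $$

with $\chi^{s\dagger}\chi^{s} = 1$. Verify that $v^\dagger v = 2E$.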
3.9 Using the KG equation together with the replacement $\partial^\mu \to \partial^\mu + iqA^\mu$,
find the form of the potential $V_{\rm KG}$ in the corresponding equation

$$ (\Box + m^2)\phi = -V_{\rm KG}\,\phi $$

in terms of $A^\mu$.
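3.10 Evaluate

$$ \{\boldsymbol{\sigma} \cdot (-i\boldsymbol{\nabla} - q\mathbf{A})\}^2\psi $$

by following the subsequent steps (or doing it your own way):

(a) Multiply the operator by itself to get

$$ \{-(\boldsymbol{\sigma} \cdot \boldsymbol{\nabla})^2 + iq(\boldsymbol{\sigma} \cdot \boldsymbol{\nabla})(\boldsymbol{\sigma} \cdot \mathbf{A}) + iq(\boldsymbol{\sigma} \cdot \mathbf{A})(\boldsymbol{\sigma} \cdot \boldsymbol{\nabla}) + q^2(\boldsymbol{\sigma} \cdot \mathbf{A})^2\}\psi. $$

The first and last terms are, respectively, $-\boldsymbol{\nabla}^2$ and $q^2\mathbf{A}^2$ where the
$2 \times 2$ unit matrix $\mathbf{1}$ is understood. The second and third terms are
$iq(\boldsymbol{\sigma} \cdot \boldsymbol{\nabla})(\boldsymbol{\sigma} \cdot \mathbf{A}\psi)$ and $iq(\boldsymbol{\sigma} \cdot \mathbf{A})(\boldsymbol{\sigma} \cdot \boldsymbol{\nabla}\psi)$. These may be simplified
using the identity of problem 3.4(b), but we must be careful to treat
$\boldsymbol{\nabla}$ correctly as a differential operator.

(b) Show that $(\boldsymbol{\sigma} \cdot \boldsymbol{\nabla})(\boldsymbol{\sigma} \cdot \mathbf{A})\psi = \boldsymbol{\nabla} \cdot (\mathbf{A}\psi) + i\boldsymbol{\sigma} \cdot \{\boldsymbol{\nabla} \times (\mathbf{A}\psi)\}$. Now use
$\boldsymbol{\nabla} \times (\mathbf{A}\psi) = (\boldsymbol{\nabla} \times \mathbf{A})\psi - \mathbf{A} \times \boldsymbol{\nabla}\psi$ to simplify the last term.

(c) Similarly, show that $(\boldsymbol{\sigma} \cdot \mathbf{A})(\boldsymbol{\sigma} \cdot \boldsymbol{\nabla})\psi = \mathbf{A} \cdot \boldsymbol{\nabla}\psi + i\boldsymbol{\sigma} \cdot (\mathbf{A} \times \boldsymbol{\nabla}\psi)$.

(d) Hence verify (3.116).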


4


Lorentz Transformations and Discrete 
Symmetries 


In this chapter we shall review various covariances (see appendix D) of the KG 
and Dirac equations, concentrating mainly on the latter. First, we consider 
Lorentz transformations (rotations and velocity transformations) and show 
how the scalar KG wavefunction and the 4-component Dirac spinor must 
transform in order that the respective equations be covariant under these 
transformations. Then we perform a similar task for the discrete transforma- 
tions of parity, charge conjugation and time reversal. The results enable us 
to construct ‘bilinear covariants’ having well-defined behaviour (scalar, pseu- 
doscalar, vector, etc.) under these transformations. This is essential for later 
work, for two reasons: first, we shall be able to do dynamical calculations in a 
way that is manifestly covariant under Lorentz transformations; and secondly 
we shall be ready to study physical problems in which the discrete transfor- 
mations are, or are not, actual symmetries of the real world, a topic to which 
we shall return in the second volume. 


4.1 Lorentz transformations 


4.1.1 The KG equation 


In order to ensure that the laws of physics are the same in all inertial frames,
we require our relativistic wave equations to be covariant under Lorentz
transformations; that is, they must have the same form in the two different frames
(see appendix D). In the case of the KG equation

$$(\Box + m^2)\phi(x) = -iq[\partial_\mu A^\mu(x) + A^\mu(x)\partial_\mu]\phi(x) + q^2A^2(x)\phi(x) \tag{4.1}$$

for a particle of charge $q$ in the field $A^\mu$, this requirement is taken care of,
almost automatically, by the notation. Consider a Lorentz transformation
such that $x \to x'$. $A^\mu$ will transform by the usual 4-vector transformation
law (i.e. like $x^\mu$), which we write as $A^\mu(x) \to A'^\mu(x')$. Similarly we write
the transform of $\phi$ as $\phi(x) \to \phi'(x')$. Then in the primed coordinate frame
physics must be described by the equation

$$(\Box' + m^2)\phi'(x') = -iq[\partial'_\mu A'^\mu(x') + A'^\mu(x')\partial'_\mu]\phi'(x') + q^2A'^2(x')\phi'(x'). \tag{4.2}$$

Now the 4-dimensional dot products appearing in (4.2) are all invariant under
the Lorentz transformation, so that (4.2) can be written as

$$(\Box + m^2)\phi'(x') = -iq[\partial_\mu A^\mu(x) + A^\mu(x)\partial_\mu]\phi'(x') + q^2A^2(x)\phi'(x'), \tag{4.3}$$

and we see that the wavefunction in the primed frame may be identified (up
to a phase) with that in the unprimed frame:

$$\phi'(x') = \phi(x). \tag{4.4}$$

Equation (4.4) is the condition for the KG equation to be covariant under
Lorentz transformations. Since $x'$ is a known function of $x$, given by the
angles and velocities parametrizing the transformation, equation (4.4) enables
one to construct the correct function $\phi'$ which the primed observers must use,
in order to be consistent with the unprimed observers.

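By way of illustration, consider a rotation of the coordinate system by an
angle $\alpha$ in a positive sense about the $x$-axis; then the position vector referred
to the new system is $\mathbf{x}' = (x', y', z')$ where

$$\begin{pmatrix} x' \\ y' \\ z' \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\alpha & \sin\alpha \\ 0 & -\sin\alpha & \cos\alpha \end{pmatrix}\begin{pmatrix} x \\ y \\ z \end{pmatrix}, \tag{4.5}$$

which we shall write as

$$\mathbf{x}' = R_x(\alpha)\,\mathbf{x}. \tag{4.6}$$
Correspondingly, equation (4.4) is, in this case,
$$\phi'(R_x(\alpha)\,\mathbf{x}) = \phi(\mathbf{x}), \tag{4.7}$$
which can also be written as
$$\phi'(\mathbf{x}) = \phi(R_x^{-1}(\alpha)\,\mathbf{x}). \tag{4.8}$$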


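It is convenient to begin with an 'infinitesimal rotation', where the angle
$\alpha$ in (4.5) is replaced by $\epsilon_x$ such that $\cos\epsilon_x \approx 1$ and $\sin\epsilon_x \approx \epsilon_x$. Then it is
easy to verify that (4.5) becomes

$$\mathbf{x}' = R_x(\epsilon_x)\,\mathbf{x} = \mathbf{x} - \boldsymbol{\epsilon}\times\mathbf{x} \tag{4.9}$$

where $\boldsymbol{\epsilon} = (\epsilon_x, 0, 0)$. For a general infinitesimal rotation, we simply replace this
$\boldsymbol{\epsilon}$ by a general one, $(\epsilon_x, \epsilon_y, \epsilon_z)$. For such a rotation, condition (4.8) becomes

$$\phi'(\mathbf{x}) = \phi(\mathbf{x} + \boldsymbol{\epsilon}\times\mathbf{x}). \tag{4.10}$$
Expanding the right hand side to first order in $\boldsymbol{\epsilon}$ we obtain

$$\phi'(\mathbf{x}) = \phi(\mathbf{x}) + (\boldsymbol{\epsilon}\times\mathbf{x})\cdot\boldsymbol{\nabla}\phi = \phi(\mathbf{x}) + \boldsymbol{\epsilon}\cdot(\mathbf{x}\times\boldsymbol{\nabla})\phi = (1 + i\boldsymbol{\epsilon}\cdot\hat{\mathbf{L}})\phi(\mathbf{x}) \tag{4.11}$$

where $\hat{\mathbf{L}}$ is the vector angular momentum operator $\mathbf{x}\times(-i\boldsymbol{\nabla})$.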
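
The infinitesimal form (4.9) can be checked numerically. The following is a minimal sketch in plain Python; the function names (`rotate_x`, `cross`, `infinitesimal`, `max_error`) and the sample vector are illustrative choices, not part of the text. It compares the exact rotation (4.5) for a small angle with x - ε × x; the discrepancy should shrink like the square of the angle.

```python
import math

def rotate_x(alpha, v):
    """Exact rotation (4.5) of the coordinate system about the x-axis."""
    x, y, z = v
    c, s = math.cos(alpha), math.sin(alpha)
    return (x, c * y + s * z, -s * y + c * z)

def cross(a, b):
    """Ordinary 3-vector cross product a x b."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def infinitesimal(eps, v):
    """First-order form x' = x - eps x x of (4.9)."""
    return tuple(vi - ci for vi, ci in zip(v, cross(eps, v)))

def max_error(eps_x, v):
    """Largest component difference between (4.5) and (4.9)."""
    exact = rotate_x(eps_x, v)
    approx = infinitesimal((eps_x, 0.0, 0.0), v)
    return max(abs(e - a) for e, a in zip(exact, approx))

x = (0.3, -1.2, 0.7)              # arbitrary sample position vector
err_coarse = max_error(1e-2, x)   # second order in the angle
err_fine = max_error(1e-3, x)     # about 100 times smaller
```

Reducing the angle by a factor of ten reduces the error by roughly a factor of a hundred, which is what "correct to first order" means here.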
The rule for finite rotations may be obtained from the infinitesimal form
by using the result
$$e^{A} = \lim_{n\to\infty}(1 + A/n)^n \tag{4.12}$$
generalized to differential operators (the exponential of a matrix being
understood as the infinite series $\exp A = 1 + A + \tfrac{1}{2}A^2 + \ldots$). Let $\boldsymbol{\epsilon} = \boldsymbol{\alpha}/n$,
where $\boldsymbol{\alpha} = (\alpha_x, \alpha_y, \alpha_z)$ are three real finite parameters; we may think of the
direction of $\boldsymbol{\alpha}$ as representing the axis of the rotation, and the magnitude of
$\boldsymbol{\alpha}$ as representing the angle of rotation. Then applying the transformation
(4.11) $n$ times, and letting $n$ tend to infinity, we obtain for the finite rotation

$$\phi'(\mathbf{x}) = e^{i\boldsymbol{\alpha}\cdot\hat{\mathbf{L}}}\phi(\mathbf{x}) \equiv \hat{U}_R(\boldsymbol{\alpha})\phi(\mathbf{x}). \tag{4.13}$$

Note that $\hat{U}_R(\boldsymbol{\alpha})$ is a unitary operator, since $\hat{U}_R^\dagger$ is the inverse rotation.

Equation (4.13) is, of course, the familiar rule for rotations of scalar wave- 
functions, exhibiting the intimate connection between rotations and angular 
momentum in quantum mechanics. We recall that if a Hamiltonian is invari- 
ant under rotations, then the operators L commute with the Hamiltonian and 
angular momentum is conserved. 

A similar calculation may be done for velocity transformations (‘boosts’), 
leading to corresponding operators K - see problem 4.1. 


4.1.2 The Dirac equation 


The case of the Dirac equation is more complicated, because (unlike the KG $\phi$)
the wavefunction has more than one component, corresponding to the fact that 
it describes a spin-1/2 particle. There is, however, a direct connection between 
the angular momentum associated with a wavefunction, and the way that the 
wavefunction transforms under rotations of the coordinate system. To take a 
simple case, the 2p wavefunctions mentioned in section 3.2 correspond to l = 1 
on the one hand and, on the other, to the components of a vector — indeed the 
most basic vector of all, the position vector $\mathbf{x} = (x,y,z)$ itself. If we rotate
the coordinate system in the way represented by (4.5), the components in the 
primed system transform into simple linear combinations of the components 
in the original system. 

Very much the same thing happens in the case of spinor wavefunctions, 
except that they transform in a way different from — though closely related to 
— that of vectors. In the present section we shall discuss how this works for 
three-dimensional rotations of the spatial coordinate system, and explain how 
it generalizes to boosts, which include transformations of the time coordinate 
as well. It will be convenient to use the alternative representation (3.40) for
the Dirac matrices. In this representation, the components $\phi$, $\chi$ of the
free-particle 4-spinor $\psi$ of (3.43) satisfy

$$E\phi = \boldsymbol{\sigma}\cdot\mathbf{p}\,\phi + m\chi \tag{4.14}$$
$$E\chi = -\boldsymbol{\sigma}\cdot\mathbf{p}\,\chi + m\phi \tag{4.15}$$
rather than (3.45) and (3.46).
As before, we start with the infinitesimal rotation (4.9). Since $\mathbf{p}$ is a vector,
it transforms in the same way as $\mathbf{x}$, so that under an infinitesimal rotation $\mathbf{p}$
becomes $\mathbf{p}'$ where

$$\mathbf{p}' = \mathbf{p} - \boldsymbol{\epsilon}\times\mathbf{p}. \tag{4.16}$$

The question for us now is: how do the spinors $\phi$ and $\chi$ transform under this
same rotation of the coordinate system?

The essential point is that in the new coordinate system the defining
equations (4.14) and (4.15) should take exactly the same form, namely

$$E\phi' = \boldsymbol{\sigma}\cdot\mathbf{p}'\,\phi' + m\chi' \tag{4.17}$$
$$E\chi' = -\boldsymbol{\sigma}\cdot\mathbf{p}'\,\chi' + m\phi' \tag{4.18}$$
where $\phi'$ and $\chi'$ are the spinors in the new coordinate system, and we have
used the fact that both $E$ and $m$ do not change under rotations. Our task is
to find $\phi'$ and $\chi'$ in terms of $\phi$ and $\chi$.


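Since both $\phi$ and $\chi$ are 2-component spinors, we might guess from (4.11)
that the answer is

$$\phi' = (1 + i\boldsymbol{\sigma}\cdot\boldsymbol{\epsilon}/2)\phi, \qquad \chi' = (1 + i\boldsymbol{\sigma}\cdot\boldsymbol{\epsilon}/2)\chi, \tag{4.19}$$

since the $\boldsymbol{\sigma}/2$ are the spin-1/2 matrices, taking the place of $\hat{\mathbf{L}}$. To check that
this is, in fact, the correct transformation law, we proceed as follows.[1] First,
multiply (4.14) from the left by the matrix $(1 + i\boldsymbol{\sigma}\cdot\boldsymbol{\epsilon}/2)$: then, since $E$ and
$m$ commute with all matrices, the result is

$$E\phi' = (1 + i\boldsymbol{\sigma}\cdot\boldsymbol{\epsilon}/2)\,\boldsymbol{\sigma}\cdot\mathbf{p}\,\phi + m\chi' \tag{4.20}$$
$$\phantom{E\phi'} = (1 + i\boldsymbol{\sigma}\cdot\boldsymbol{\epsilon}/2)\,\boldsymbol{\sigma}\cdot\mathbf{p}\,(1 - i\boldsymbol{\sigma}\cdot\boldsymbol{\epsilon}/2)\phi' + m\chi' \tag{4.21}$$

where we have used
$$(1 + i\boldsymbol{\sigma}\cdot\boldsymbol{\epsilon}/2)^{-1} \approx (1 - i\boldsymbol{\sigma}\cdot\boldsymbol{\epsilon}/2) \tag{4.22}$$

to first order in $\boldsymbol{\epsilon}$. Keeping only first order terms in $\boldsymbol{\epsilon}$, the first term on the
right hand side of (4.21) is

$$\left(\boldsymbol{\sigma}\cdot\mathbf{p} + \tfrac{1}{2}i\,\boldsymbol{\sigma}\cdot\boldsymbol{\epsilon}\;\boldsymbol{\sigma}\cdot\mathbf{p} - \tfrac{1}{2}i\,\boldsymbol{\sigma}\cdot\mathbf{p}\;\boldsymbol{\sigma}\cdot\boldsymbol{\epsilon}\right)\phi'. \tag{4.23}$$
This can be simplified using the result from problem 3.4(b):
$$\boldsymbol{\sigma}\cdot\mathbf{a}\;\boldsymbol{\sigma}\cdot\mathbf{b} = \mathbf{a}\cdot\mathbf{b} + i\boldsymbol{\sigma}\cdot\mathbf{a}\times\mathbf{b}, \tag{4.24}$$

provided all the components of $\mathbf{a}$ and $\mathbf{b}$ commute. Applying (4.24), (4.23)
becomes

$$\left[\boldsymbol{\sigma}\cdot\mathbf{p} + \tfrac{1}{2}i(\boldsymbol{\epsilon}\cdot\mathbf{p} + i\boldsymbol{\sigma}\cdot\boldsymbol{\epsilon}\times\mathbf{p}) - \tfrac{1}{2}i(\boldsymbol{\epsilon}\cdot\mathbf{p} + i\boldsymbol{\sigma}\cdot\mathbf{p}\times\boldsymbol{\epsilon})\right]\phi' \tag{4.25}$$
$$= (\boldsymbol{\sigma}\cdot\mathbf{p} - \boldsymbol{\sigma}\cdot\boldsymbol{\epsilon}\times\mathbf{p})\phi' = \boldsymbol{\sigma}\cdot\mathbf{p}'\,\phi'. \tag{4.26}$$

[1] We shall derive (4.19), and the corresponding rule for velocity transformations, equation
(4.42) below, in appendix M of volume 2 using group theory.

Hence (4.21) is just
$$E\phi' = \boldsymbol{\sigma}\cdot\mathbf{p}'\,\phi' + m\chi' \tag{4.27}$$

as required in (4.17). We can similarly check the correctness of the
transformation law (4.19) for $\chi$.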
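
The first-order statement (4.26) lends itself to a quick numerical check. The sketch below, in plain Python with illustrative helper names (`sigma_dot`, `matmul`, `cross`, `residual`) and arbitrarily chosen sample values, builds the 2 x 2 matrices explicitly and measures how far (1 + iσ·ε/2) σ·p (1 - iσ·ε/2) is from σ·p'. Since the two sides agree to first order, the residual should scale as the square of ε.

```python
def sigma_dot(v):
    """2x2 matrix sigma . v built from the Pauli matrices."""
    x, y, z = v
    return [[z, x - 1j * y], [x + 1j * y, -z]]

def matmul(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def residual(eps, p):
    """Largest entry of (1 + i s.e/2) s.p (1 - i s.e/2) - s.p'."""
    se = sigma_dot(eps)
    left = [[(1 if i == j else 0) + 0.5j * se[i][j] for j in range(2)]
            for i in range(2)]
    right = [[(1 if i == j else 0) - 0.5j * se[i][j] for j in range(2)]
             for i in range(2)]
    lhs = matmul(matmul(left, sigma_dot(p)), right)
    p_prime = tuple(pi - ci for pi, ci in zip(p, cross(eps, p)))  # (4.16)
    rhs = sigma_dot(p_prime)
    return max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))

p = (0.4, -0.9, 1.3)                              # sample momentum
r_small = residual((1e-3, 2e-3, -1e-3), p)        # second order in eps
r_smaller = residual((1e-4, 2e-4, -1e-4), p)      # about 100 times smaller
```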

The transformation rule for a finite rotation may be obtained from the
infinitesimal form by using the result (4.12) applied to matrices. Then for a
finite rotation we obtain the result

$$\phi' = \exp(i\boldsymbol{\sigma}\cdot\boldsymbol{\alpha}/2)\,\phi, \qquad \chi' = \exp(i\boldsymbol{\sigma}\cdot\boldsymbol{\alpha}/2)\,\chi. \tag{4.28}$$

We note that the behaviour of $\phi$ and $\chi$ under rotations is the same: equation
(4.28) is the way all 2-component spinors transform under rotations.

By way of an illustration, consider the case of the finite rotation (4.5).
Here $\boldsymbol{\alpha} = (\alpha, 0, 0)$, and the transformation matrix is

$$\exp(i\sigma_x\alpha/2) = 1 + i\sigma_x\alpha/2 + \tfrac{1}{2}(i\sigma_x\alpha/2)^2 + \ldots \tag{4.29}$$

Multiplying out the terms in (4.29) and remembering that $\sigma_x^2 = 1$, we see that
the transformation matrix is

$$\cos\alpha/2 + i\sigma_x\sin\alpha/2 = \begin{pmatrix} \cos\alpha/2 & i\sin\alpha/2 \\ i\sin\alpha/2 & \cos\alpha/2 \end{pmatrix}. \tag{4.30}$$

This means that the components $\phi_1, \phi_2$ of the spinor $\phi$ transform according
to the rule

$$\phi'_1 = \cos\alpha/2\;\phi_1 + i\sin\alpha/2\;\phi_2 \tag{4.31}$$
$$\phi'_2 = i\sin\alpha/2\;\phi_1 + \cos\alpha/2\;\phi_2, \tag{4.32}$$

for this particular rotation. The transformed components are linear
combinations of the original components, but it is the half-angle $\alpha/2$ that enters, not
$\alpha$.
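
The passage from the limit (4.12) to the closed form (4.30) can be illustrated numerically. This is a minimal sketch in plain Python, with hypothetical helper names (`matpow`, `rotation_by_limit`, `rotation_closed_form`) and an arbitrary choice of n; it raises 1 + A/n to a large power, with A = iσ_x α/2, and compares the result with cos α/2 + iσ_x sin α/2.

```python
import math

def matmul(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matpow(A, n):
    """A**n for integer n >= 0, by repeated squaring."""
    result = [[1, 0], [0, 1]]
    while n:
        if n & 1:
            result = matmul(result, A)
        A = matmul(A, A)
        n >>= 1
    return result

def rotation_by_limit(alpha, n=2**20):
    """(1 + A/n)**n with A = i sigma_x alpha/2, cf. (4.12)."""
    a = 1j * alpha / (2 * n)
    return matpow([[1, a], [a, 1]], n)

def rotation_closed_form(alpha):
    """cos(alpha/2) + i sigma_x sin(alpha/2), the matrix of (4.30)."""
    c, s = math.cos(alpha / 2), math.sin(alpha / 2)
    return [[c, 1j * s], [1j * s, c]]

alpha = 1.0
U_limit = rotation_by_limit(alpha)
U_exact = rotation_closed_form(alpha)
max_diff = max(abs(U_limit[i][j] - U_exact[i][j])
               for i in range(2) for j in range(2))
```

With n = 2**20 the two matrices agree to roughly one part in ten million; the remaining difference falls off like 1/n.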

Let us denote the finite transformation matrix by $U$, so that

$$U = \exp(i\boldsymbol{\sigma}\cdot\boldsymbol{\alpha}/2) \quad\text{and}\quad U^\dagger = \exp(-i\boldsymbol{\sigma}\cdot\boldsymbol{\alpha}/2). \tag{4.33}$$

It follows that
$$UU^\dagger = U^\dagger U = 1, \tag{4.34}$$

since the rotation parametrized by $-\boldsymbol{\alpha}$ clearly undoes the rotation parametrized
by $\boldsymbol{\alpha}$. So $U$ is a 2 x 2 unitary matrix. It follows that the normalization of
$\phi$ and $\chi$ is preserved under rotations: $\phi'^\dagger\phi' = \phi^\dagger\phi$, and $\chi'^\dagger\chi' = \chi^\dagger\chi$. The
free-particle Dirac probability density $\rho = \psi^\dagger\psi = \phi^\dagger\phi + \chi^\dagger\chi$ is therefore also
(as we expect) invariant under rotations.

More interestingly, we can examine the way the free-particle current density

$$\mathbf{j} = \psi^\dagger\boldsymbol{\alpha}\psi = \phi^\dagger\boldsymbol{\sigma}\phi - \chi^\dagger\boldsymbol{\sigma}\chi \tag{4.35}$$
transforms under rotations. Of course, it should behave as a 3-vector, and
this is checked in problem 4.2(a).
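We now turn to the behaviour of the spinors $\phi$ and $\chi$ under boosts, which
mix $\mathbf{x}$ and $t$, or equivalently $\mathbf{p}$ and $E$. For example, consider a Lorentz
velocity transformation (boost) from a frame S to a frame S' which is moving
with speed $u$ with respect to S along the common $x$-axis. Then the energy $E$
and momentum $p_x$ of a particle in S are transformed to $E'$ and $p'_x$ in S' where
(cf (D.1))

$$E' = \cosh\vartheta\,E - \sinh\vartheta\,p_x \tag{4.36}$$
$$p'_x = \cosh\vartheta\,p_x - \sinh\vartheta\,E, \tag{4.37}$$

where $\cosh\vartheta = (1 - u^2)^{-1/2} \equiv \gamma(u)$, and $\sinh\vartheta = \gamma(u)u$. As before, we
start with an infinitesimal transformation, where $\vartheta$ is replaced by $\eta_x$ such
that $\cosh\eta_x \approx 1$ and $\sinh\eta_x \approx \eta_x$. Then (4.36) and (4.37) become
$E' = E - \eta_x p_x$, $p'_x = p_x - \eta_x E$. For the general infinitesimal boost parametrized
by $\boldsymbol{\eta} = (\eta_x, \eta_y, \eta_z)$, the transformation law for $(E, \mathbf{p})$ is
$$E' = E - \boldsymbol{\eta}\cdot\mathbf{p} \tag{4.38}$$
$$\mathbf{p}' = \mathbf{p} - \boldsymbol{\eta}E. \tag{4.39}$$
Once again, we have to determine $\phi'$ and $\chi'$ such that the transformed versions
of (4.14) and (4.15) are
$$(E' - \boldsymbol{\sigma}\cdot\mathbf{p}')\phi' = m\chi' \tag{4.40}$$
$$(E' + \boldsymbol{\sigma}\cdot\mathbf{p}')\chi' = m\phi'. \tag{4.41}$$
Note that this time $E$ does transform, according to (4.38).
The required $\phi'$ and $\chi'$ are

$$\phi' = (1 - \boldsymbol{\sigma}\cdot\boldsymbol{\eta}/2)\phi, \qquad \chi' = (1 + \boldsymbol{\sigma}\cdot\boldsymbol{\eta}/2)\chi. \tag{4.42}$$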


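The spinors $\phi$ and $\chi$ behaved the same under rotations, but they transform
differently under boosts. There are two kinds of 2-component spinors, $\phi$-type
and $\chi$-type, in the representation (3.40), which are distinguished by their
behaviour under boosts. The group theory behind this will be explained in
appendix M of volume 2.

To verify the rule (4.42), take equation (4.14) in the form (4.40) and
multiply from the left by the matrix $(1 + \boldsymbol{\sigma}\cdot\boldsymbol{\eta}/2)$, to obtain

$$(1 + \boldsymbol{\sigma}\cdot\boldsymbol{\eta}/2)(E - \boldsymbol{\sigma}\cdot\mathbf{p})\phi = m\chi', \tag{4.43}$$
or equivalently
$$(1 + \boldsymbol{\sigma}\cdot\boldsymbol{\eta}/2)(E - \boldsymbol{\sigma}\cdot\mathbf{p})(1 + \boldsymbol{\sigma}\cdot\boldsymbol{\eta}/2)\phi' = m\chi', \tag{4.44}$$

where we have used $(1 - \boldsymbol{\sigma}\cdot\boldsymbol{\eta}/2)^{-1} \approx (1 + \boldsymbol{\sigma}\cdot\boldsymbol{\eta}/2)$. For (4.44) to be consistent
with (4.40) we require

$$(1 + \boldsymbol{\sigma}\cdot\boldsymbol{\eta}/2)(E - \boldsymbol{\sigma}\cdot\mathbf{p})(1 + \boldsymbol{\sigma}\cdot\boldsymbol{\eta}/2) = E' - \boldsymbol{\sigma}\cdot\mathbf{p}'. \tag{4.45}$$
Keeping only first order terms in $\boldsymbol{\eta}$, the left hand side of (4.45) is

$$E - \boldsymbol{\sigma}\cdot\mathbf{p} + E\,\boldsymbol{\sigma}\cdot\boldsymbol{\eta} - \tfrac{1}{2}(\boldsymbol{\sigma}\cdot\mathbf{p}\;\boldsymbol{\sigma}\cdot\boldsymbol{\eta} + \boldsymbol{\sigma}\cdot\boldsymbol{\eta}\;\boldsymbol{\sigma}\cdot\mathbf{p}) \tag{4.46}$$
$$= E - \boldsymbol{\eta}\cdot\mathbf{p} - \boldsymbol{\sigma}\cdot(\mathbf{p} - \boldsymbol{\eta}E) \tag{4.47}$$
$$= E' - \boldsymbol{\sigma}\cdot\mathbf{p}' \tag{4.48}$$

as required for the right hand side of (4.45).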
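
The same kind of numerical check used for rotations works for the boost condition (4.45). The sketch below is plain Python with illustrative names (`sigma_dot`, `matmul`, `boost_residual`) and arbitrary sample values for the mass and momentum. It evaluates both sides of (4.45) using (4.38) and (4.39) for E' and p', and confirms that the difference is of second order in η.

```python
import math

def sigma_dot(v):
    """2x2 matrix sigma . v built from the Pauli matrices."""
    x, y, z = v
    return [[z, x - 1j * y], [x + 1j * y, -z]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def boost_residual(eta, m, p):
    """Largest entry of (1 + s.eta/2)(E - s.p)(1 + s.eta/2) - (E' - s.p')."""
    E = math.sqrt(m * m + sum(c * c for c in p))
    se, sp = sigma_dot(eta), sigma_dot(p)
    half = [[(1 if i == j else 0) + 0.5 * se[i][j] for j in range(2)]
            for i in range(2)]
    wave_op = [[(E if i == j else 0) - sp[i][j] for j in range(2)]
               for i in range(2)]
    lhs = matmul(matmul(half, wave_op), half)
    E_new = E - sum(a * b for a, b in zip(eta, p))          # (4.38)
    p_new = tuple(pc - ec * E for pc, ec in zip(p, eta))    # (4.39)
    sp_new = sigma_dot(p_new)
    rhs = [[(E_new if i == j else 0) - sp_new[i][j] for j in range(2)]
           for i in range(2)]
    return max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))

m, p = 1.0, (0.4, -0.9, 1.3)                       # sample mass and momentum
r1 = boost_residual((1e-3, -2e-3, 1e-3), m, p)     # second order in eta
r2 = boost_residual((1e-4, -2e-4, 1e-4), m, p)     # about 100 times smaller
```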

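For a finite boost $\phi$ and $\chi$ transform by the 'exponentiation' of (4.42),
namely

$$\phi' = \exp(-\boldsymbol{\sigma}\cdot\boldsymbol{\vartheta}/2)\,\phi, \qquad \chi' = \exp(\boldsymbol{\sigma}\cdot\boldsymbol{\vartheta}/2)\,\chi \tag{4.49}$$

where the three real parameters $\boldsymbol{\vartheta} = (\vartheta_x, \vartheta_y, \vartheta_z)$ specify the direction and
magnitude of the boost. In contrast to (4.28), the transformations (4.49) are
not unitary. If we denote the matrix $\exp(-\boldsymbol{\sigma}\cdot\boldsymbol{\vartheta}/2)$ by $B$, we have $B = B^\dagger$
rather than $B^{-1} = B^\dagger$. So $B$ does not leave $\phi^\dagger\phi$ and $\chi^\dagger\chi$ invariant. Actually
this is no surprise. We already know from section 4.1.2 that the density
$\phi^\dagger\phi + \chi^\dagger\chi$ ought to transform as the fourth component $\rho$ of the 4-vector
$j^\mu = (\rho, \mathbf{j})$. Let us check this for our infinitesimal boost:

$$\begin{aligned}
\rho' &= \phi'^\dagger\phi' + \chi'^\dagger\chi' \\
&= \phi^\dagger(1 - \boldsymbol{\sigma}\cdot\boldsymbol{\eta}/2)(1 - \boldsymbol{\sigma}\cdot\boldsymbol{\eta}/2)\phi + \chi^\dagger(1 + \boldsymbol{\sigma}\cdot\boldsymbol{\eta}/2)(1 + \boldsymbol{\sigma}\cdot\boldsymbol{\eta}/2)\chi \\
&= \phi^\dagger\phi + \chi^\dagger\chi - \phi^\dagger\boldsymbol{\sigma}\phi\cdot\boldsymbol{\eta} + \chi^\dagger\boldsymbol{\sigma}\chi\cdot\boldsymbol{\eta} \\
&= \rho - \boldsymbol{\eta}\cdot\mathbf{j}
\end{aligned} \tag{4.50}$$

as required by (4.38). Similarly, it may be verified (problem 4.2(b)) that $\mathbf{j}$
transforms as the 3-vector part of the 4-vector $j^\mu$, under this infinitesimal
boost.

On the other hand, the products $\phi^\dagger\chi$ and $\chi^\dagger\phi$ are clearly invariant under
the transformation (4.49), since the exponential factors cancel. This means
that the quantity $\psi^\dagger\beta\psi$ is a Lorentz invariant.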
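
A short numerical illustration of these statements follows, as a sketch in plain Python. It assumes the closed form exp(±σ·ϑ/2) = cosh(ϑ/2) ± σ·n sinh(ϑ/2), which follows from (σ·n)² = 1 in the same way as (4.30); the helper names and the sample spinors are illustrative only. The density φ†φ + χ†χ changes under the finite boost (4.49), while φ†χ + χ†φ does not.

```python
import math

def sigma_dot(v):
    x, y, z = v
    return [[z, x - 1j * y], [x + 1j * y, -z]]

def boost_matrix(theta, sign):
    """cosh(t/2) + sign * (sigma.n) sinh(t/2) = exp(sign * sigma.theta / 2)."""
    t = math.sqrt(sum(c * c for c in theta))
    sn = sigma_dot([c / t for c in theta])
    ch, sh = math.cosh(t / 2), math.sinh(t / 2)
    return [[(ch if i == j else 0) + sign * sh * sn[i][j] for j in range(2)]
            for i in range(2)]

def apply(M, v):
    return [M[0][0] * v[0] + M[0][1] * v[1], M[1][0] * v[0] + M[1][1] * v[1]]

def inner(a, b):
    """Hermitian inner product a^dagger b."""
    return sum(complex(x).conjugate() * y for x, y in zip(a, b))

theta = (0.0, 0.0, 0.8)              # sample boost parameter along z
phi = [1.0 + 0.0j, 0.5j]             # arbitrary sample 2-spinors
chi = [0.2 + 0.0j, -1.0 + 0.3j]

B_phi = boost_matrix(theta, -1)      # acts on phi-type spinors, (4.49)
B_chi = boost_matrix(theta, +1)      # acts on chi-type spinors, (4.49)
phi_b, chi_b = apply(B_phi, phi), apply(B_chi, chi)

invariant_before = inner(phi, chi) + inner(chi, phi)      # psi^dagger beta psi
invariant_after = inner(phi_b, chi_b) + inner(chi_b, phi_b)
density_before = (inner(phi, phi) + inner(chi, chi)).real
density_after = (inner(phi_b, phi_b) + inner(chi_b, chi_b)).real
```

The matrix `B_phi` is Hermitian but not unitary, which is why `density_after` differs from `density_before` while the mixed product is unchanged.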

At this point it is beginning to be clear that a more 'covariant-looking'
notation would be very desirable. In the case of the KG probability current,
the 4-vector index $\mu$ was clearly visible in the expression on the right-hand side
of (3.20), but there is nothing similar in the Dirac case so far. In problem 4.3
the four '$\gamma$ matrices' are introduced, defined by $\gamma^\mu = (\gamma^0, \boldsymbol{\gamma})$ with $\gamma^0 = \beta$ and
$\boldsymbol{\gamma} = \beta\boldsymbol{\alpha}$, together with the quantity $\bar{\psi} = \psi^\dagger\gamma^0$, in terms of which the Dirac
$\rho$ of (3.51) and $\mathbf{j}$ of (3.57) can be written as $\bar{\psi}(x)\gamma^0\psi(x)$ and $\bar{\psi}(x)\boldsymbol{\gamma}\psi(x)$
respectively. The complete Dirac 4-current is then

$$j^\mu = \bar{\psi}(x)\gamma^\mu\psi(x). \tag{4.51}$$

For free particle solutions, we (and problem 4.2) have established that $j^\mu$
of (4.51) indeed transforms as a 4-vector under infinitesimal rotations and
boosts. We have also just seen that the quantity $\bar{\psi}\psi$ is an invariant.

We end this section by illustrating the use of the finite boost
transformations (4.49). Consider two frames S and S', such that in S a particle is at rest
with $E = m$, $\mathbf{p} = 0$, and with spin up along the $z$-axis; in S', the particle has
energy $E'$, momentum $\mathbf{p}' = (0, 0, p')$, and spin up along the $z$-axis. If we apply
a boost such that S' has velocity $(0, 0, -v')$ relative to S, where $v' = p'/E'$,
then $E$ and $\mathbf{p}$ become

$$E' = \cosh\vartheta'\,E = m\gamma(v') \tag{4.52}$$
$$p' = \sinh\vartheta'\,E = mv'\gamma(v') \tag{4.53}$$
as required. Now consider the forms of the 4-spinors in S and S'. In S,
from (4.14) and (4.15) we have simply $\phi = \chi$, and if we normalize such that
$\bar{u}u = 2m$ we may take

$$u_S = \sqrt{m}\begin{pmatrix} \phi_+ \\ \phi_+ \end{pmatrix}, \qquad \phi_+ = \begin{pmatrix} 1 \\ 0 \end{pmatrix}. \tag{4.54}$$

In S' the spinor is

$$u_{S'} = N\begin{pmatrix} \phi_+ \\ \dfrac{E' - p'}{m}\,\phi_+ \end{pmatrix}, \tag{4.55}$$

where the normalization $N$ is determined (since $\bar{u}u$ is invariant) from the
condition $\bar{u}_{S'}u_{S'} = 2m$ to be $N = (E' + p')^{1/2}$, giving

$$u_{S'} = \begin{pmatrix} (E' + p')^{1/2}\,\phi_+ \\ (E' - p')^{1/2}\,\phi_+ \end{pmatrix}. \tag{4.56}$$

But we can also calculate $u_{S'}$ by applying the transformation (4.49) with
$\tanh\vartheta = -v'$ (so that $\vartheta = -\vartheta'$) to $u_S$. Then the upper two components become

$$\phi' = \sqrt{m}\,e^{-\sigma_z\vartheta/2}\phi_+ = \sqrt{m}\,e^{\vartheta'/2}\phi_+, \tag{4.57}$$
while the lower two components become
$$\chi' = \sqrt{m}\,e^{-\vartheta'/2}\phi_+. \tag{4.58}$$

Now we can write

$$e^{\vartheta'/2} = (e^{\vartheta'})^{1/2} = (\cosh\vartheta' + \sinh\vartheta')^{1/2} = \left(\frac{E' + p'}{m}\right)^{1/2} \tag{4.59}$$

and
$$e^{-\vartheta'/2} = \left(\frac{E' - p'}{m}\right)^{1/2} \tag{4.60}$$

and so we recover (4.56).
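
The agreement between the boosted rest-frame spinor and (4.56) is easy to confirm with numbers. The following sketch in plain Python uses an illustrative function name (`boosted_rest_spinor`) and arbitrary sample values for m and p', in units with c = 1. It computes the rapidity from tanh ϑ' = p'/E' and compares the coefficients of φ₊ obtained from (4.57) and (4.58) with (E' ± p')^{1/2}.

```python
import math

def boosted_rest_spinor(m, p):
    """Coefficients of phi_+ in the upper and lower halves of u_S'."""
    E = math.sqrt(m * m + p * p)
    rapidity = math.atanh(p / E)                     # tanh(theta') = v' = p'/E'
    upper = math.sqrt(m) * math.exp(rapidity / 2)    # (4.57)
    lower = math.sqrt(m) * math.exp(-rapidity / 2)   # (4.58)
    return E, upper, lower

m, p = 0.105, 0.3                                    # sample mass and momentum
E, upper, lower = boosted_rest_spinor(m, p)
direct_upper = math.sqrt(E + p)                      # upper entry of (4.56)
direct_lower = math.sqrt(E - p)                      # lower entry of (4.56)
ubar_u = 2 * upper * lower                           # phi'^dagger chi' + chi'^dagger phi'
```

The product `ubar_u` equals 2m, as it must if the normalization is boost invariant.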
4.2 Discrete transformations: P, C and T


The transformations we considered in section 4.1 are known as ‘continuous’, 
because the parameters involved (angles, speeds) vary continuously. This is 
essentially the reason we were able to build up finite transformations from 
infinitesimal ones, which differ only slightly from the identity transformation: 
finite transformations could be reached continuously from the identity. But 
there is another class of transformations, called ‘discrete’, which cannot be 
reached continuously from the identity. Examples of discrete transformations 
are parity (or space inversion), charge conjugation, and time reversal, and 
their combinations. Although these discrete transformations are important 
primarily in weak interactions, which we shall not cover until the second vol- 
ume, it is useful to discuss the behaviour of Dirac wavefunctions under discrete 
transformations at this stage. Among other things, more light will be cast on 
antiparticles. 


4.2.1 Parity 


The parity (or space inversion) transformation P is defined by

$$\mathrm{P}:\ \mathbf{x} \to \mathbf{x}' = -\mathbf{x}, \qquad t \to t; \tag{4.61}$$

that is, P inverts the spatial coordinates. It follows that P also inverts
momenta ($\mathbf{p} \to -\mathbf{p}$) but does not change angular momenta ($\mathbf{x}\times\mathbf{p} \to \mathbf{x}\times\mathbf{p}$) or
spin ($\boldsymbol{\sigma} \to \boldsymbol{\sigma}$). We already see that there are two kinds of 3-vectors: polar
3-vectors which change sign under P and axial vectors which do not. For
example, the electric field $\mathbf{E}$ and the vector potential $\mathbf{A}$ are polar vectors, while
the magnetic field $\mathbf{B}$ is an axial vector. There are also scalar quantities (such
as $\mathbf{x}\cdot\mathbf{p}$) which do not change sign under P, and pseudoscalar quantities (such
as $\boldsymbol{\sigma}\cdot\mathbf{p}$) which do.

Consider first the KG equation (4.1). Since $\mathbf{A}$ is a polar vector, it changes
sign under parity, as does $\boldsymbol{\nabla}$, while both $\partial/\partial t$ and $A^0$ remain the same. The
scalar products $\partial_\mu A^\mu$ and $A^\mu\partial_\mu$ are therefore invariant under parity, as are
$\Box$ and $A^2$. Hence we may identify $\phi_P(\mathbf{x}', t) = \phi(\mathbf{x}, t)$, or equivalently

$$\phi_P(\mathbf{x}, t) = \phi(-\mathbf{x}, t) = \hat{P}_0\phi(\mathbf{x}, t), \tag{4.62}$$

where $\hat{P}_0$ is the coordinate inversion operator. Note that we are calling the
transformed wavefunction $\phi_P$ rather than yet another $\phi'$ since we need to
keep track of what transformation we are considering. If we take $\phi(x)$ to be
a positive-energy free particle solution with energy $E$ and momentum $\mathbf{p}$, $\phi_P$
will describe a positive energy particle with momentum $-\mathbf{p}$, as we expect.
Now let us study the covariance of the free particle Dirac equation

$$i\frac{\partial\psi(\mathbf{x}, t)}{\partial t} = -i\boldsymbol{\alpha}\cdot\boldsymbol{\nabla}\psi(\mathbf{x}, t) + \beta m\psi(\mathbf{x}, t) \tag{4.63}$$
under P. Equation (4.63) will be covariant under (4.61) if we can find a
wavefunction $\psi_P(\mathbf{x}', t)$ for observers using the transformed coordinate system
such that their Dirac equation has exactly the same form in their system as
(4.63):

$$i\frac{\partial\psi_P}{\partial t}(\mathbf{x}', t) = -i\boldsymbol{\alpha}\cdot\boldsymbol{\nabla}'\psi_P(\mathbf{x}', t) + \beta m\psi_P(\mathbf{x}', t). \tag{4.64}$$
Now we know that $\boldsymbol{\nabla}' = -\boldsymbol{\nabla}$, since $\mathbf{x}' = -\mathbf{x}$. Hence (4.64) becomes
$$i\frac{\partial\psi_P}{\partial t}(\mathbf{x}', t) = i\boldsymbol{\alpha}\cdot\boldsymbol{\nabla}\psi_P(\mathbf{x}', t) + \beta m\psi_P(\mathbf{x}', t). \tag{4.65}$$

Multiplying this equation from the left by $\beta$ and using $\beta\boldsymbol{\alpha} = -\boldsymbol{\alpha}\beta$ we find

$$i\frac{\partial}{\partial t}[\beta\psi_P(\mathbf{x}', t)] = -i\boldsymbol{\alpha}\cdot\boldsymbol{\nabla}[\beta\psi_P(\mathbf{x}', t)] + \beta m[\beta\psi_P(\mathbf{x}', t)]. \tag{4.66}$$

Comparing (4.66) and (4.63), it follows that we may consistently translate
between $\psi$ and $\psi_P$ using the relation

$$\psi(\mathbf{x}, t) = \beta\psi_P(-\mathbf{x}, t), \tag{4.67}$$

or equivalently

$$\psi_P(\mathbf{x}, t) = \beta\psi(-\mathbf{x}, t). \tag{4.68}$$

Equation (4.68) is the required relation between the wavefunctions in the two
systems; it may be compared to (4.4) and (4.62).

In principle we could include an arbitrary phase factor $\eta_P$ on the right
hand side of (4.68) and (4.62); such a phase leaves the normalization of $\psi$ and $\bar{\psi}$,
and all bilinears of the form $\bar{\psi}$ (gamma matrix) $\psi$ unaltered. The possibility
of such a phase factor did not arise in the case of Lorentz transformations,
since for infinitesimal ones the transformed $\psi'$ and the original $\psi$ differ only
infinitesimally (not by a finite phase factor). But the parity transformation
cannot be built up out of infinitesimal steps: the coordinate system is either
reflected or it is not. We will choose $\eta_P = 1$.

As an example of (4.68), consider the free particle solutions in the standard
form (3.41), (3.72):

$$\psi(\mathbf{x}, t) = N\begin{pmatrix} \phi \\ \dfrac{\boldsymbol{\sigma}\cdot\mathbf{p}}{E+m}\phi \end{pmatrix}\exp(-iEt + i\mathbf{p}\cdot\mathbf{x}). \tag{4.69}$$
Then
$$\psi_P(\mathbf{x}, t) = \beta\psi(-\mathbf{x}, t) = N\begin{pmatrix} \phi \\ -\dfrac{\boldsymbol{\sigma}\cdot\mathbf{p}}{E+m}\phi \end{pmatrix}\exp(-iEt - i\mathbf{p}\cdot\mathbf{x}) \tag{4.70}$$

which can be conveniently summarized by the simple statement that the
three-momentum $\mathbf{p}$ as seen in the parity transformed system is minus that in the
original one, as expected. Note that $\boldsymbol{\sigma}$ does not change sign.
It is also interesting to look at the behaviour of the spinors $\phi$ and $\chi$ in the
representation (3.40), where they satisfy the equations (4.14) and (4.15).
Under parity $\mathbf{p} \to -\mathbf{p}$, so we can immediately see that $\phi_P = \chi$ and $\chi_P = \phi$. Thus
the 2-component spinors $\phi$ and $\chi$ are (in this representation) interchanged
under parity.

The analysis leading to (4.68) may be extended to the case of the Dirac
equation (3.102) for a particle of charge $q$ in the field $A^\mu$. As already noted,
$\mathbf{A}$ is a polar vector, transforming under P like $\mathbf{x}$ or $\boldsymbol{\nabla}$; the scalar potential $A^0$ is
invariant under parity. The combination $(-i\boldsymbol{\nabla} - q\mathbf{A})$ therefore changes sign
under parity, and the manipulations following (4.65) proceed as before.

We may introduce a corresponding parity operator $\hat{P}$, which is unitary
and acts on wavefunctions so as to change $\psi$ into $\psi_P$; then

$$\hat{P}\psi(\mathbf{x}, t) = \beta\psi(-\mathbf{x}, t) = \beta\hat{P}_0\psi(\mathbf{x}, t), \tag{4.71}$$

so that
$$\hat{P} = \beta\hat{P}_0. \tag{4.72}$$

Applying $\hat{P}$ twice, we find
$$\hat{P}^2\psi(\mathbf{x}, t) = \psi(\mathbf{x}, t) \tag{4.73}$$

which implies that the eigenvalues of $\hat{P}$ are $\pm 1$.

For example, the positive energy rest-frame spinors ((3.73) with $\mathbf{p} = 0$)
are eigenstates of $\hat{P}$ with eigenvalue +1, and the negative energy rest-frame
spinors are eigenstates of $\hat{P}$ with eigenvalue -1. Such rest-frame eigenvalues
of $\hat{P}$ are called intrinsic parities. The correspondence between negative energy
solutions and antiparticles, discussed in the preceding section, then suggests
that a fermion and its antiparticle have opposite intrinsic parity (note that
the parity eigenvalue is multiplicative). We shall be able to derive this result
after quantization of the Dirac field, in chapter 7.

As usual in quantum mechanics, we may consider the action of $\hat{P}$ on
operators as well as wavefunctions. In particular, the parity transform of a Dirac
Hamiltonian $\hat{H}(\mathbf{x})$ will be

$$\hat{P}\hat{H}(\mathbf{x})\hat{P}^\dagger = \beta\hat{P}_0\hat{H}(\mathbf{x})\hat{P}_0^\dagger\beta. \tag{4.74}$$

If the Hamiltonian is invariant under parity, the right hand side of (4.74) will
equal $\hat{H}$ and the operator $\hat{P}$ will commute with $\hat{H}$; the eigenvalue of $\hat{P}$ will
then be conserved. The reader may easily check that the Hamiltonian for the
charged particle in a field $A^\mu$ is parity invariant, using $\hat{P}\mathbf{A}\hat{P}^\dagger = -\mathbf{A}$.

With the rule (4.68) in hand, we can examine how various bilinear
covariants, such as $\bar{\psi}\psi$ or $\bar{\psi}\gamma^\mu\psi$, transform under parity. For example,

$$\bar{\psi}_P(\mathbf{x}', t)\psi_P(\mathbf{x}', t) = \psi^\dagger(\mathbf{x}, t)\beta\beta\beta\psi(\mathbf{x}, t) = \bar{\psi}(\mathbf{x}, t)\psi(\mathbf{x}, t), \tag{4.75}$$
showing that $\bar{\psi}\psi$ is a scalar. Similarly, for a 4-vector

$$v^\mu(\mathbf{x}, t) = (v^0(\mathbf{x}, t), \mathbf{v}(\mathbf{x}, t)) = \bar{\psi}(\mathbf{x}, t)\gamma^\mu\psi(\mathbf{x}, t), \tag{4.76}$$
the reader may check in problem 4.4(a) that $v^0$ is a scalar and $\mathbf{v}$ is a polar
vector.
More interesting possibilities emerge when we introduce a new $\gamma$-matrix,
$\gamma_5$, defined by
$$\gamma_5 = i\gamma^0\gamma^1\gamma^2\gamma^3. \tag{4.77}$$
This matrix has the defining property that it anticommutes with the $\gamma^\mu$
matrices:

$$\{\gamma_5, \gamma^\mu\} = 0. \tag{4.78}$$

Consider now the quantity $p(\mathbf{x}, t) = \bar{\psi}(\mathbf{x}, t)\gamma_5\psi(\mathbf{x}, t)$. We find
$$\bar{\psi}_P(\mathbf{x}', t)\gamma_5\psi_P(\mathbf{x}', t) = \bar{\psi}(\mathbf{x}, t)\beta\gamma_5\beta\psi(\mathbf{x}, t) = -\bar{\psi}(\mathbf{x}, t)\gamma_5\psi(\mathbf{x}, t), \tag{4.79}$$

so that $p(\mathbf{x}, t)$ is a pseudoscalar. Similarly, the reader may verify in problem
4.4(b) that the quantity $a^\mu(\mathbf{x}, t) = \bar{\psi}(\mathbf{x}, t)\gamma_5\gamma^\mu\psi(\mathbf{x}, t)$ transforms under
(infinitesimal) rotations and boosts as a 4-vector, but that under parity $a^0(\mathbf{x}, t)$
is a pseudoscalar and $\mathbf{a}(\mathbf{x}, t)$ is an axial vector.
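
The matrix statements used above can be verified by direct multiplication. The sketch below is plain Python and assumes the standard (Dirac) representation, β = diag(1, -1) and α_i with σ_i in the off-diagonal blocks; this explicit form, and all helper names, are assumptions made for illustration. It checks the anticommutation property (4.78) and evaluates β M β, which is how the matrix M in a bilinear ψ̄ M ψ changes under (4.68).

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def blocks(a, b, c, d):
    """4x4 matrix [[a, b], [c, d]] assembled from 2x2 blocks."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

def scale(s, A):
    return [[s * x for x in row] for row in A]

def add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def is_zero(A, tol=1e-12):
    return all(abs(x) < tol for row in A for x in row)

ONE = [[1, 0], [0, 1]]
ZERO = [[0, 0], [0, 0]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

# Standard representation (assumed): beta = diag(1, -1), alpha_i off-diagonal.
BETA = blocks(ONE, ZERO, ZERO, scale(-1, ONE))
ALPHA = [blocks(ZERO, s, s, ZERO) for s in SIGMA]
GAMMA = [BETA] + [matmul(BETA, a) for a in ALPHA]        # gamma^0 .. gamma^3
GAMMA5 = scale(1j, matmul(matmul(GAMMA[0], GAMMA[1]),
                          matmul(GAMMA[2], GAMMA[3])))   # (4.77)

anticommute = [is_zero(add(matmul(GAMMA5, g), matmul(g, GAMMA5)))
               for g in GAMMA]                           # (4.78)

def parity(M):
    """beta M beta: the matrix sandwiched in psi-bar M psi after parity."""
    return matmul(matmul(BETA, M), BETA)

pseudoscalar_flips = is_zero(add(parity(GAMMA5), GAMMA5))      # sign change
time_part_kept = is_zero(add(parity(GAMMA[0]), scale(-1, GAMMA[0])))
space_part_flips = is_zero(add(parity(GAMMA[1]), GAMMA[1]))    # polar vector
```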

Matrix elements formed from $v^\mu$ and $a^\mu$ would have to be Lorentz
invariant, of the form $v_\mu v^\mu$, $a_\mu a^\mu$, or $v_\mu a^\mu$. For the first of these, we find (shortening
the notation)

$$v_{P\mu}v_P^\mu = v^0v^0 - (-\mathbf{v})\cdot(-\mathbf{v}) = v_\mu v^\mu, \tag{4.80}$$
and similarly $a_{P\mu}a_P^\mu = a_\mu a^\mu$. Thus both of these matrix elements are scalars,
taking the same form in both systems. However, this is not true of $v_\mu a^\mu$:

$$v_{P\mu}a_P^\mu = v^0(-a^0) - (-\mathbf{v})\cdot(\mathbf{a}) = -v_\mu a^\mu, \tag{4.81}$$

showing that this quantity is a pseudoscalar, changing sign when we change
systems. By itself, such a sign change would be irrelevant, since observables
will depend on the modulus squared of the matrix element. If, however, the
matrix element for a process has the form $(v_\mu - a_\mu)(v^\mu - a^\mu)$, for example,
where both scalar and pseudoscalar parts are present, then the physics in one
coordinate system and in the parity-transformed system will not be the same.
One says 'parity is violated': only one of the systems can represent the real
world; parity is conserved if physics in the two coordinate systems is the same.

Lee and Yang (1956) were the first to point out that, while there was strong
evidence for parity conservation in strong and electromagnetic interactions, its
status in weak interactions was at that time untested. They proposed that a
clear signal of parity violation could be found in weak decays from initially
polarized states (i.e. $\langle\mathbf{s}\rangle \neq 0$): if the distribution of final state particles
depends on odd powers of the cosine of the angle between the initial spin
direction and the final momentum, then parity is violated (note that $\langle\mathbf{s}\rangle\cdot\mathbf{p}$
is a pseudoscalar). The first experiment to demonstrate parity violation was
performed by Wu et al. (1957), using the $\beta$-decay of polarized $^{60}$Co. Lee and
Yang (1956) also remarked that parity violation in the decay

$$\pi^+ \to \mu^+ + \nu_\mu \tag{4.82}$$
implies that the spin of the muon will be polarized along the direction of its
momentum, and furthermore that the angular distribution of positrons in the
subsequent decay

$$\mu^+ \to e^+ + \bar{\nu}_\mu + \nu_e \tag{4.83}$$

would (as in the $^{60}$Co experiment) serve as an analyser. This suggestion
was quickly confirmed by Garwin et al. (1957) and by Friedman and Telegdi
(1957); in the rest frame of the pion, the $\mu^+$ spin is aligned opposite to its
momentum, a situation that would be reversed in the parity transformed
frame.

The end result of many years of research was to establish that the currents
responsible for weak interactions of quarks and leptons have precisely the
'$v^\mu - a^\mu$' structure, leading to the observed parity violation (see volume 2).


4.2.2 Charge conjugation 


Dirac’s hole theory led him to the remarkable prediction of the positron, and 
suggested a new kind of symmetry: to each charged spin-1/2 particle there 
must correspond an antiparticle with the opposite charge and the same mass. 
Feynman's interpretation of the negative energy solutions of the KG and Dirac 
equations assumes that this symmetry holds for both bosons and fermions. 
We now explore the idea of particle-antiparticle symmetry more formally. 

We begin with the KG equation for a spin-0 particle of mass $m$ and charge
$q$ in an electromagnetic field $A^\mu$, namely equation (4.1). Inspection of this
equation shows at once that the wavefunction $\phi_C$ of a particle with the same
mass and charge $-q$ is related to the original wavefunction $\phi$ by

$$\phi_C = \eta_C\phi^* \tag{4.84}$$

where $\eta_C$ is an arbitrary phase factor which we shall take to be unity. Equation
(4.84) tells us how to connect the solutions of the particle (charge $q$) and
antiparticle (charge $-q$) equations. When applied to free-particle solutions of
the KG equation, the transformation (4.84) relates positive and negative
4-momentum solutions, as expected in the Feynman interpretation of the latter.
We may extend the transformation (4.84) to a symmetry operation for the
KG equation (4.1) if we introduce an operation which changes the sign of $A^\mu$.
Then the combined operation 'take the complex conjugate of $\phi$ and change $A^\mu$
to $-A^\mu$' is a formal symmetry of (4.1), in the sense that the wavefunction $\phi^*$
in the field $-A^\mu$ satisfies exactly the same equation as does the wavefunction
$\phi$ in the field $A^\mu$. Of course, we have just seen that $\phi^*$ is the antiparticle
wavefunction, so it is no surprise that the dynamics of the antiparticle in
a field $-A^\mu$ is the same as that of the particle in a field $A^\mu$. Still, this is a
symmetry of the KG equation, which we will call charge conjugation, denoted
by C:
$$\mathrm{C}:\ \phi \to \phi_C = \phi^*, \qquad A^\mu \to A_C^\mu = -A^\mu. \tag{4.85}$$
We can ask: how does the electromagnetic current behave under this
transformation? The expression for the KG current is found by multiplying the
free-particle probability current by the charge $q$, and by replacing $\partial^\mu$ by the
gauge-invariant operator $D^\mu = \partial^\mu + iqA^\mu$. This leads to

$$\begin{aligned}
j^\mu_{KG\,em}(\phi, A^\mu) &= iq\{\phi^*(\partial^\mu + iqA^\mu)\phi - [(\partial^\mu + iqA^\mu)\phi]^*\phi\} \\
&= iq[\phi^*\partial^\mu\phi - (\partial^\mu\phi^*)\phi] - 2q^2A^\mu\phi^*\phi.
\end{aligned} \tag{4.86}$$
The current for $\phi_C$, $A_C^\mu$ is then

$$\begin{aligned}
j^\mu_{KG\,em}(\phi_C, A_C^\mu) &= iq[\phi_C^*\partial^\mu\phi_C - (\partial^\mu\phi_C^*)\phi_C] - 2q^2A_C^\mu\phi_C^*\phi_C \\
&= iq[\phi\,\partial^\mu\phi^* - (\partial^\mu\phi)\phi^*] + 2q^2A^\mu\phi\phi^* \\
&= -j^\mu_{KG\,em}(\phi, A^\mu).
\end{aligned} \tag{4.87}$$
As we would hope, the KG current changes sign under C.


Now consider the Dirac equation for a particle of mass $m$ and charge $q$ in
a field $A^\mu$, which we write in the form

$$\frac{\partial\psi}{\partial t} = (-\boldsymbol{\alpha}\cdot\boldsymbol{\nabla} + iq\boldsymbol{\alpha}\cdot\mathbf{A} - i\beta m - iqA^0)\psi. \tag{4.88}$$

We want to relate solutions of this equation to the solution $\psi_C$ of the same
equation with $q$ replaced by $-q$. As in the KG case, we begin by writing down
the complex conjugate equation,

$$\frac{\partial\psi^*}{\partial t} = (-\alpha_1\partial_1 + \alpha_2\partial_2 - \alpha_3\partial_3 - iq\alpha_1A^1 + iq\alpha_2A^2 - iq\alpha_3A^3 + i\beta m + iqA^0)\psi^* \tag{4.89}$$

where we have used the fact that $\alpha_1$, $\alpha_3$ and $\beta$ are real and $\alpha_2$ is pure
imaginary, which is the case in both the standard representation of the Dirac
matrices, and the representation (3.40). Now imagine multiplying (4.89) from
the left by a matrix $c$, with the properties that it commutes with $\alpha_1$ and $\alpha_3$,
but anticommutes with $\alpha_2$ and $\beta$. Then (4.89) will become

$$\frac{\partial(c\psi^*)}{\partial t} = (-\boldsymbol{\alpha}\cdot\boldsymbol{\nabla} - iq\boldsymbol{\alpha}\cdot\mathbf{A} - i\beta m + iqA^0)\,c\psi^* \tag{4.90}$$

which is just (4.88) with $q$ replaced by $-q$. So we may identify the
charge-conjugate Dirac wavefunction as

$$\psi_C = \eta_C\,c\psi^* \tag{4.91}$$
where $\eta_C$ is the usual arbitrary phase factor. The required $c$ is
$$c = \beta\alpha_2 = \gamma^2 \tag{4.92}$$

as the reader may easily verify. It is customary to choose $\eta_C = i$, and so
finally the connection between $\psi_C$ and $\psi$ is

$$\psi_C(x) = C_0\psi^*(x), \quad\text{where } C_0 = i\gamma^2. \tag{4.93}$$
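
The algebraic properties required of c are simple to confirm by matrix multiplication. The sketch below is plain Python; the explicit matrices are assumptions made for illustration: the standard representation is taken as β = diag(1, -1) with α_i off-diagonal, and the alternative representation (3.40) is taken as α = diag(σ, -σ) with β off-diagonal, which is the form consistent with (4.14) and (4.15). In both cases c = βα₂ should commute with α₁ and α₃ and anticommute with α₂ and β.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def blocks(a, b, c, d):
    """4x4 matrix [[a, b], [c, d]] assembled from 2x2 blocks."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

def scale(s, A):
    return [[s * x for x in row] for row in A]

def add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def is_zero(A, tol=1e-12):
    return all(abs(x) < tol for row in A for x in row)

ONE = [[1, 0], [0, 1]]
ZERO = [[0, 0], [0, 0]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

def representation(name):
    """Return ([alpha_1, alpha_2, alpha_3], beta) for the named representation."""
    if name == "standard":
        beta = blocks(ONE, ZERO, ZERO, scale(-1, ONE))
        alpha = [blocks(ZERO, s, s, ZERO) for s in SIGMA]
    else:  # alternative representation, assumed to be that of (3.40)
        beta = blocks(ZERO, ONE, ONE, ZERO)
        alpha = [blocks(s, ZERO, ZERO, scale(-1, s)) for s in SIGMA]
    return alpha, beta

def check_c(name):
    alpha, beta = representation(name)
    c = matmul(beta, alpha[1])                       # c = beta alpha_2, (4.92)
    commutes = lambda M: is_zero(add(matmul(c, M), scale(-1, matmul(M, c))))
    anticommutes = lambda M: is_zero(add(matmul(c, M), matmul(M, c)))
    return (commutes(alpha[0]) and commutes(alpha[2])
            and anticommutes(alpha[1]) and anticommutes(beta))

results = {name: check_c(name) for name in ("standard", "alternative")}
```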
Let us look at the effect of the transformation (4.93) on free-particle
solutions of the Dirac equation. Referring to (3.73) we find that a positive energy
spinor is transformed to

$$\begin{aligned}
u_C(p, s) = i\gamma^2u^*(p, s) &= (E+m)^{1/2}\begin{pmatrix} 0 & i\sigma_2 \\ -i\sigma_2 & 0 \end{pmatrix}\begin{pmatrix} \phi^{s*} \\ \dfrac{\boldsymbol{\sigma}^*\cdot\mathbf{p}}{E+m}\phi^{s*} \end{pmatrix} \\
&= (E+m)^{1/2}\begin{pmatrix} \dfrac{\boldsymbol{\sigma}\cdot\mathbf{p}}{E+m}(-i\sigma_2\phi^{s*}) \\ -i\sigma_2\phi^{s*} \end{pmatrix}
\end{aligned} \tag{4.94}$$
where we have used $\sigma_2^* = -\sigma_2$, $\sigma_2\sigma_1 = -\sigma_1\sigma_2$ and $\sigma_2\sigma_3 = -\sigma_3\sigma_2$. The
4-spinor (4.94) is a negative energy solution $v(p, s)$ as in (3.82), identifying
$-i\sigma_2\phi^{s*}$ with $\chi^s$. Accordingly we have shown that

$$u_C(p, s) = v(p, s). \tag{4.95}$$

Similarly, as the reader may check,
$$v_C(p, s) = i\gamma^2v^*(p, s) = u(p, s). \tag{4.96}$$

So from a positive energy free-particle spinor associated with 4-momentum $p$
and spin $s$ the transformation (4.93) produces a negative energy free-particle
spinor associated with the same 4-momentum and spin, and vice versa: that
is, $u$ and $v$ are charge-conjugate spinors.
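
The relation u_C = v can be checked with explicit numbers. The following sketch is plain Python with illustrative helper names and arbitrary sample values; it assumes the standard representation, in which iγ² has the block form [[0, iσ₂], [-iσ₂, 0]], and the spinor definitions quoted in problem 3.8. It builds u from a normalized φ, conjugates it, and compares the result with v built from χ = -iσ₂φ*.

```python
import math

def sigma_dot(v):
    x, y, z = v
    return [[z, x - 1j * y], [x + 1j * y, -z]]

def apply(M, v):
    return [sum(M[i][k] * v[k] for k in range(2)) for i in range(2)]

I_SIGMA2 = [[0, 1], [-1, 0]]          # i * sigma_2 is a real matrix

def energy(p, m):
    return math.sqrt(m * m + sum(c * c for c in p))

def u_spinor(p, m, phi):
    """Positive-energy spinor (phi, sigma.p phi / (E+m)) times (E+m)^(1/2)."""
    E = energy(p, m)
    lower = [c / (E + m) for c in apply(sigma_dot(p), phi)]
    return [math.sqrt(E + m) * c for c in list(phi) + lower]

def v_spinor(p, m, chi):
    """Negative-energy spinor (sigma.p chi / (E+m), chi) times (E+m)^(1/2)."""
    E = energy(p, m)
    upper = [c / (E + m) for c in apply(sigma_dot(p), chi)]
    return [math.sqrt(E + m) * c for c in upper + list(chi)]

def charge_conjugate(psi):
    """psi_C = i gamma^2 psi*, with i gamma^2 = [[0, i s2], [-i s2, 0]]."""
    conj = [complex(c).conjugate() for c in psi]
    upper = apply(I_SIGMA2, conj[2:])
    lower = [-c for c in apply(I_SIGMA2, conj[:2])]
    return upper + lower

p, m = (0.4, -0.9, 1.3), 0.5                          # sample momentum and mass
phi = [0.6 + 0.0j, 0.8j]                              # phi^dagger phi = 1
chi = [-c for c in apply(I_SIGMA2, [c.conjugate() for c in phi])]  # -i s2 phi*
u = u_spinor(p, m, phi)
u_c = charge_conjugate(u)
v = v_spinor(p, m, chi)
mismatch = max(abs(a - b) for a, b in zip(u_c, v))
norm_u = sum(abs(c) ** 2 for c in u)                  # should equal 2E
```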

At this point we may wonder if it is possible to construct a self-conjugate 
4-spinor. Such a spinor would be appropriate for a fermionic particle which 
is the same as its antiparticle — that is, for a Majorana fermion, so named 
after Ettore Majorana who first raised this possibility (Majorana 1937). To 
pursue this idea, it is convenient to use the representation (3.40) for the Dirac 
matrices again, in order to keep track of the Lorentz transformation property 
of the Majorana spinor. Consider the 4-spinor 


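$$\psi_M = \begin{pmatrix} \phi \\ \mathrm{i}\sigma_2\phi^* \end{pmatrix}. \qquad (4.97)$$

Then

$$\psi_{M\,C} = \mathrm{i}\gamma^2\psi_M^* = \begin{pmatrix} 0 & -\mathrm{i}\sigma_2 \\ \mathrm{i}\sigma_2 & 0 \end{pmatrix}\begin{pmatrix} \phi^* \\ \mathrm{i}\sigma_2\phi \end{pmatrix} = \begin{pmatrix} \phi \\ \mathrm{i}\sigma_2\phi^* \end{pmatrix} = \psi_M, \qquad (4.98)$$

so that indeed $\psi_M$ is self-conjugate. The Lorentz transformation property
of $\psi_M$ is consistent, since we may easily show (problem 4.4(c)) that the
2-spinor $\sigma_2\phi^*$ transforms as a $\chi$-type spinor. The reader can construct a similar
self-conjugate 4-spinor using $\chi$ rather than $\phi$.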
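
As a quick numerical illustration of this self-conjugacy, the sketch below builds a 4-spinor of the form $(\phi, \mathrm{i}\sigma_2\phi^*)$ and applies $\psi \to \mathrm{i}\gamma^2\psi^*$. It assumes that representation (3.40) has $\boldsymbol{\alpha} = \mathrm{diag}(\boldsymbol{\sigma}, -\boldsymbol{\sigma})$ and off-diagonal $\beta$, so that $\mathrm{i}\gamma^2$ is block off-diagonal with entries $-\mathrm{i}\sigma_2$ and $\mathrm{i}\sigma_2$; the function names and sample spinors are hypothetical.

```python
S2 = [[0, -1j], [1j, 0]]   # Pauli matrix sigma_2


def matvec(M, v):
    """Multiply a square matrix (nested lists) into a vector (list)."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]


def conj(v):
    """Complex-conjugate every component of a vector."""
    return [complex(c).conjugate() for c in v]


def charge_conjugate_chiral(psi):
    """psi_C = i gamma^2 psi*, with i gamma^2 = [[0, -i s2], [i s2, 0]].

    This block form is an assumption about representation (3.40).
    """
    c = conj(psi)
    top = [-1j * x for x in matvec(S2, c[2:])]
    bottom = [1j * x for x in matvec(S2, c[:2])]
    return top + bottom


def majorana(phi):
    """Build the 4-spinor (phi, i sigma_2 phi*)."""
    return [complex(x) for x in phi] + [1j * x for x in matvec(S2, conj(phi))]


phi = [0.3 + 0.4j, -1.1 + 0.2j]            # arbitrary sample 2-spinor
psi_m = majorana(phi)
err_majorana = max(
    abs(a - b) for a, b in zip(charge_conjugate_chiral(psi_m), psi_m)
)

# A generic 4-spinor, by contrast, is changed by the transformation.
psi_generic = [1.0, 0.5j, -0.2, 0.7 + 0.1j]
diff_generic = max(
    abs(a - b) for a, b in zip(charge_conjugate_chiral(psi_generic), psi_generic)
)
print(err_majorana, diff_generic)
```

The first number vanishes to rounding accuracy, while the second does not: only the constrained form with two independent complex components is self-conjugate.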

A self-conjugate fermion has to carry no distinguishing quantum number, 
such as electromagnetic charge. The only known neutral fermions are the neu- 
trinos, and until quite recently it was assumed that they are Dirac fermions, 
with distinct antiparticles (the relevant distinguishing quantum number being
lepton number). However, as we shall see in volume 2, owing to their very
small mass, it is hard to discriminate between the two possibilities (Majorana 
and Dirac) for neutrinos, and a definitive answer will have to await the result 
of a crucial experiment, the search for neutrinoless double beta decay, which 
is only possible for Majorana neutrinos. 

Returning to more conventional matters, we extend (as in the KG case) 
the transformation (4.93) to a formal symmetry of the Dirac equation by 
including the sign change of $A^\mu$, so that C for the Dirac equation is


$$\mathrm{C}: \quad \psi \to \psi_C = \mathrm{i}\gamma^2\psi^*, \qquad A^\mu \to -A^\mu. \qquad (4.99)$$


We now examine how the electromagnetic current behaves under C in the 
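Dirac case. The Dirac charge density is the probability density $\psi^\dagger\psi$ multiplied
by the charge $q$, and the electromagnetic 3-current is the probability current
$\psi^\dagger\boldsymbol{\alpha}\psi$ multiplied by $q$:

$$j^\mu_{\mathrm{em}} = (q\psi^\dagger\psi,\ q\psi^\dagger\boldsymbol{\alpha}\psi) = q\bar\psi\gamma^\mu\psi. \qquad (4.100)$$

Consider the charge density: under the transformation (4.93) this becomes

$$q\psi_C^\dagger\psi_C = q\psi^{\mathrm{T}}\gamma^{2\dagger}\gamma^2\psi^* = q\psi^{\mathrm{T}}\alpha_2\beta\beta\alpha_2\psi^* = q\psi^{\mathrm{T}}\psi^*. \qquad (4.101)$$

In terms of the four components of $\psi$, the product $\psi^{\mathrm{T}}\psi^*$ is $\psi_1\psi_1^* + \psi_2\psi_2^* +
\psi_3\psi_3^* + \psi_4\psi_4^*$. These components are ordinary functions which commute with
each other, so $\psi^{\mathrm{T}}\psi^* = \psi^{*\mathrm{T}}\psi = \psi^\dagger\psi$; hence

$$q\psi_C^\dagger\psi_C = q\psi^\dagger\psi \qquad (4.102)$$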


and the charge density does not change sign under C. Similarly, one finds that 
the electromagnetic 3-current does not change sign either. 
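
The statement that neither the charge density nor the 3-current changes sign can be illustrated for ordinary (commuting) wavefunction components. The sketch below assumes the standard representation, with $\boldsymbol{\alpha}$ block off-diagonal, and uses an arbitrary sample spinor; all names are illustrative.

```python
# Pauli matrices
S1 = [[0, 1], [1, 0]]
S2 = [[0, -1j], [1j, 0]]
S3 = [[1, 0], [0, -1]]


def matvec(M, v):
    """Multiply a square matrix (nested lists) into a vector (list)."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]


def inner(a, b):
    """Scalar product a^dagger b of two vectors."""
    return sum(complex(x).conjugate() * y for x, y in zip(a, b))


def charge_conjugate(psi):
    """psi_C = i gamma^2 psi* (standard representation assumed)."""
    c = [complex(x).conjugate() for x in psi]
    top = [1j * x for x in matvec(S2, c[2:])]
    bottom = [-1j * x for x in matvec(S2, c[:2])]
    return top + bottom


def density_and_current(psi):
    """Return psi^dagger psi and the three components of psi^dagger alpha psi.

    With alpha = [[0, sigma], [sigma, 0]] the current is
    up^dagger sigma low + low^dagger sigma up.
    """
    up, low = psi[:2], psi[2:]
    rho = inner(psi, psi).real
    j = [
        (inner(up, matvec(S, low)) + inner(low, matvec(S, up))).real
        for S in (S1, S2, S3)
    ]
    return rho, j


psi = [0.2 + 0.5j, -0.7, 0.1j, 0.4 - 0.3j]      # arbitrary sample 4-spinor
rho, j = density_and_current(psi)
rho_c, j_c = density_and_current(charge_conjugate(psi))
print(rho, rho_c)
print(j, j_c)
```

The two densities and the two currents agree component by component, with no relative sign.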

These results can be interpreted in the hole theory picture: the current 
due to a physical positive energy antiparticle of charge q and momentum p is 
regarded as the same as that of a missing negative energy particle of charge 
$-q$ and momentum $p$. Our charge conjugation operation explicitly constructs
the positive energy antiparticle wavefunction from the negative energy particle 
one. 

Yet this is not really what we want a true charge conjugation operator to 
do: which is, rather, to change a positive energy particle into a positive energy 
antiparticle. The same inadequacy was true in the KG case also. There is 
no way of representing such an operation in a single particle wavefunction 
formalism. The appropriate formalism is quantum field theory, in which $\psi(x)$
becomes a quantum field operator (as do bosonic fields), and there is a unitary
quantum field operator $\hat{C}$ with the required property. We shall see in chapter
7 that fermionic operators anticommute with each other, and that this is just
what is needed to ensure that the current changes sign under $\hat{C}$. Bosonic
fields, on the other hand, obey commutation rather than anticommutation
relations, and this safeguards the change in sign of the bosonic current.

We have approached charge conjugation following the historical route,
which is to say via the electromagnetic interaction. But we can ask whether
(true) C is a good symmetry of other interactions, for example the weak
interaction. Consider applying C to the reaction (4.82), so that it becomes


$$\pi^- \to \mu^- + \bar\nu_\mu. \qquad (4.103)$$


If C were a good symmetry, the (parity-violating) longitudinal polarization
of the $\mu^-$ in (4.103) should be the same as that of the $\mu^+$ in (4.82). But
in fact it is the opposite, the $\mu^-$ spin being aligned along the direction of
its momentum. So C, like P, is violated in weak interactions. It is a good
symmetry in electromagnetic and strong interactions.


4.2.3 CP 


It has probably occurred to the reader that, although C and P are each 
violated in the decays (4.82) and (4.103), the combined transformation CP 
might be a good symmetry: particles are changed to antiparticles, the sense 
of longitudinal polarization is reversed, and the corresponding decays occur. 
Indeed, the rates for these two decays are the same, and CP is conserved. 
For a while, after 1956, it was hoped that CP would prove to be always 
conserved, so as to avoid a ‘lopsided’ distinction between right and left, and 
between matter and antimatter. But before long Christenson et al. (1964) 
reported evidence for CP violation in the decays of neutral K-mesons, a result 
soon confirmed by other experiments. 

As we mentioned in section 1.2.2, it was the difficulty of incorporating CP 
violation into the 2-generation electroweak theory that led Kobayashi and 
Maskawa (1973) to propose a third generation of quarks, which allowed a CP 
violating parameter to be included quite naturally. CP violation in K-decays 
is a small effect (of order one part in $10^3$), but Carter and Sanda (1980)
showed that considerably larger effects, up to 20%, could be expected in rare
decays of neutral B mesons, according to the framework of Kobayashi and
Maskawa (KM). Some 20 years later, the ‘B factories’ at the asymmetric $e^-e^+$
colliders PEPII and KEKB began producing B mesons by the many millions,
and intensive study of CP violation in the $B^0(d\bar{b})$-$\bar{B}^0(\bar{d}b)$ systems followed
at the BaBar and Belle detectors. Remarkably, all observations to date are 
consistent with the original KM parametrization. We shall return to this 
topic when we discuss weak interactions in volume 2, specifically in chapter 
21. Meanwhile we refer to Bettini (2008), chapter 8, for an introductory 
overview. 

It is worth pausing here to note the significance of CP violation. First 
of all, it implies that there is an absolute distinction between matter and 
antimatter and, as a consequence, between left and right: these are not merely 
a matter of convention. For example, the rate for the process 


$$B^0 \to K^+\pi^- \qquad (4.104)$$
is some 20% greater (Nakamura et al. 2010) than the rate for the CP-conjugate
process

$$\bar{B}^0 \to K^-\pi^+. \qquad (4.105)$$
(Note that the $\bar{B}^0$ state is conventionally defined as the CP transform of the
$B^0$ state.) So the pion distinguished by being emitted in the higher-yielding
reaction (4.104) defines ‘negatively charged’, and the polarization of the muon
in its decay (4.103) defines what is a right-handed screw sense.

Secondly, CP (and C) violation is one of the three conditions² established
by Sakharov (1967) that would enable a universe containing initially equal 
amounts of matter and antimatter, when created in the Big Bang, to evolve 
into the matter-dominated universe we see today — rather than simply having 
the required imbalance as an initial condition. Within the Standard Model, 
all known CP violating effects are attributable to the KM mechanism. But 
calculations show (Huet and Sather 1995) that the matter-antimatter asym- 
metry generated from this source is very many orders of magnitude too small. 
This is, therefore, one area of physics where the Standard Model fails. 

Thirdly, CP violation is directly connected to the violation of another 
discrete symmetry, namely time reversal T, because very general principles of 
quantum field theory imply that the product CPT (in any order) is conserved:
the CPT theorem. This theorem states (Lüders 1954, 1957, Pauli 1957) that
CPT must be an exact symmetry for any Lorentz invariant quantum field 
theory constructed out of local fields, with a Hermitian Hamiltonian, and 
quantized according to the usual spin-statistics rule (integer spin particles are 
bosons, half-odd integer spin particles are fermions). Thus any violation of 
CP implies a violation of T if CPT is to be conserved. 

We shall return to CPT presently, but first let us deal with T. 


4.2.4 Time reversal 
The time reversal transformation T is defined by 
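$$\mathrm{T}: \quad \boldsymbol{x} \to \boldsymbol{x}' = \boldsymbol{x}, \qquad t \to t' = -t; \qquad (4.106)$$


that is, T reverses the direction of time. It follows that T reverses momenta
($\boldsymbol{p} \to -\boldsymbol{p}$) and angular momenta ($\boldsymbol{x}\times\boldsymbol{p} \to -\boldsymbol{x}\times\boldsymbol{p}$). Let us also note how
the electromagnetic potentials transform under T: $A^0$ does not change, being
generated by static charges, while $\boldsymbol{A}$ changes sign, since it is produced by
currents; that is,


$$A^0_T(t') = A^0(t), \qquad \boldsymbol{A}_T(t') = -\boldsymbol{A}(t). \qquad (4.107)$$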


It follows that the electric field E does not change sign under T, but the 
magnetic field B does. It is easily checked that these prescriptions ensure 
that the Maxwell equations are covariant under T. 


²The other two are (a) the existence of baryon number violating transitions and (b) a
time when the C, CP and baryon number violating transitions proceeded out of thermal
equilibrium.

Consider first the behaviour of the KG equation for a particle of charge $q$
in the field $A^\mu$:


$$(\Box + m^2)\phi(t) = -\mathrm{i}q[\partial_\mu A^\mu(t) + A^\mu(t)\partial_\mu]\phi(t) + q^2A^2(t)\phi(t). \qquad (4.108)$$


The equation in the time-reversed system is


$$(\Box' + m^2)\phi_T(t') = -\mathrm{i}q[\partial'_\mu A_T^\mu(t') + A_T^\mu(t')\partial'_\mu]\phi_T(t') + q^2A_T^2(t')\phi_T(t'). \qquad (4.109)$$


Using (4.107) we obtain
$$\partial'_\mu A_T^\mu(t') = -\partial_\mu A^\mu(t), \quad A_T^\mu(t')\partial'_\mu = -A^\mu(t)\partial_\mu, \quad A_T^2(t') = A^2(t). \qquad (4.110)$$
It follows, on comparing (4.109) with the complex conjugate of (4.108), that we
can identify


$$\phi_T(t') = \phi^*(t) \qquad (4.111)$$


up to an arbitrary phase factor, here chosen to be unity. If $\phi$ is a positive-energy
free particle solution, $\phi_T$ represents a particle of positive energy in the
time-reversed system, with momentum $-\boldsymbol{p}$ as expected.
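Now consider the behaviour under T of the Dirac equation for a particle
of charge $q$ in a field $A^\mu$,

$$\mathrm{i}\frac{\partial\psi(t)}{\partial t} = \{\boldsymbol{\alpha}\cdot[-\mathrm{i}\nabla - q\boldsymbol{A}(t)] + \beta m + qA^0(t)\}\psi(t) \qquad (4.112)$$
where we have suppressed the spatial coordinate arguments. In the time-reversed
system, the corresponding equation is

$$\mathrm{i}\frac{\partial\psi_T(t')}{\partial t'} = \{\boldsymbol{\alpha}\cdot[-\mathrm{i}\nabla - q\boldsymbol{A}_T(t')] + \beta m + qA^0_T(t')\}\psi_T(t'). \qquad (4.113)$$

To relate $\psi_T$ to $\psi$ we start by taking the complex conjugate of (4.112) so as
to obtain

$$-\mathrm{i}\frac{\partial\psi^*(t)}{\partial t} = \{\boldsymbol{\alpha}^*\cdot[\mathrm{i}\nabla - q\boldsymbol{A}(t)] + \beta^* m + qA^0(t)\}\psi^*(t) \qquad (4.114)$$

which we may rewrite as

$$\mathrm{i}\frac{\partial\psi^*(t)}{\partial t'} = \{\boldsymbol{\alpha}^*\cdot[\mathrm{i}\nabla + q\boldsymbol{A}_T(t')] + \beta^* m + qA^0_T(t')\}\psi^*(t). \qquad (4.115)$$

Now suppose a unitary matrix $U_T$ exists such that
$$U_T\boldsymbol{\alpha}^*U_T^\dagger = -\boldsymbol{\alpha}, \qquad U_T\beta^*U_T^\dagger = \beta; \qquad (4.116)$$

then it is clear that the Dirac equation will be covariant under T with the
identification
$$\psi_T(t') = U_T\psi^*(t). \qquad (4.117)$$

In either of the two representations of the Dirac matrices which we have been
using, $\alpha_1$, $\alpha_3$ and $\beta$ are real, while $\alpha_2$ is pure imaginary; it follows that $U_T$
must commute with $\alpha_2$ and $\beta$, and anticommute with $\alpha_1$ and $\alpha_3$. A suitable
$U_T$ is

$$U_T = \mathrm{i}\alpha_1\alpha_3 \qquad (4.118)$$
where the phase is a conventional choice.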
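
The conditions (4.116) on $U_T$ are easy to verify by explicit matrix multiplication. The following minimal sketch does so in plain Python, assuming the standard representation ($\boldsymbol{\alpha}$ block off-diagonal, $\beta$ diagonal); the helper names are illustrative, and the last check compares $U_T$ with the block-diagonal form $\mathrm{diag}(\sigma_2, \sigma_2)$ that it takes in this representation.

```python
# 2x2 building blocks
S1 = [[0, 1], [1, 0]]
S2 = [[0, -1j], [1j, 0]]
S3 = [[1, 0], [0, -1]]
I2 = [[1, 0], [0, 1]]
MINUS_I2 = [[-1, 0], [0, -1]]
Z2 = [[0, 0], [0, 0]]


def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from 2x2 blocks."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]


def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def conj(A):
    return [[complex(x).conjugate() for x in row] for row in A]


def dagger(A):
    return [list(col) for col in zip(*conj(A))]


def scale(c, A):
    return [[c * x for x in row] for row in A]


def max_diff(A, B):
    return max(abs(x - y) for ra, rb in zip(A, B) for x, y in zip(ra, rb))


# Standard representation (an assumption of this sketch)
alpha = [block(Z2, S, S, Z2) for S in (S1, S2, S3)]
beta = block(I2, Z2, Z2, MINUS_I2)

U_T = scale(1j, matmul(alpha[0], alpha[2]))     # U_T = i alpha_1 alpha_3


def time_reverse(M):
    """Return U_T M* U_T^dagger."""
    return matmul(matmul(U_T, conj(M)), dagger(U_T))


err_alpha = max(max_diff(time_reverse(a), scale(-1, a)) for a in alpha)
err_beta = max_diff(time_reverse(beta), beta)
err_form = max_diff(U_T, block(S2, Z2, Z2, S2))
print(err_alpha, err_beta, err_form)
```

All three numbers are zero, confirming that $U_T\boldsymbol{\alpha}^*U_T^\dagger = -\boldsymbol{\alpha}$ and $U_T\beta^*U_T^\dagger = \beta$ for this choice.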


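Let us check what is the effect of the transformation (4.117) on a positive-energy
plane wave solution (3.74). In the representation (3.31) $U_T$ is given
by
$$U_T = \begin{pmatrix} \sigma_2 & 0 \\ 0 & \sigma_2 \end{pmatrix} \qquad (4.119)$$
and so
$$\psi_T(\boldsymbol{x}', t') = (E+m)^{1/2}\begin{pmatrix} \sigma_2 & 0 \\ 0 & \sigma_2 \end{pmatrix}\begin{pmatrix} \phi^* \\ \dfrac{\boldsymbol{\sigma}^*\cdot\boldsymbol{p}}{E+m}\,\phi^* \end{pmatrix}\exp(\mathrm{i}Et - \mathrm{i}\boldsymbol{p}\cdot\boldsymbol{x}) = (E+m)^{1/2}\begin{pmatrix} \sigma_2\phi^* \\ \dfrac{\boldsymbol{\sigma}\cdot\boldsymbol{p}'}{E+m}\,\sigma_2\phi^* \end{pmatrix}\exp(-\mathrm{i}Et' + \mathrm{i}\boldsymbol{p}'\cdot\boldsymbol{x}'), \qquad (4.120)$$
which is a positive-energy solution with the expected momentum $\boldsymbol{p}' = -\boldsymbol{p}$,
and with the transformed spinor wavefunction $\sigma_2\phi^*$. If we take $\phi$ to be a
helicity eigenstate
$$\frac{\boldsymbol{\sigma}\cdot\boldsymbol{p}}{|\boldsymbol{p}|}\,\phi_\lambda = \lambda\phi_\lambda \qquad (4.121)$$
where $\lambda = \pm 1$, then it follows that
$$\frac{\boldsymbol{\sigma}\cdot\boldsymbol{p}'}{|\boldsymbol{p}'|}\,\sigma_2\phi_\lambda^* = \lambda\,\sigma_2\phi_\lambda^*, \qquad (4.122)$$
and the helicity is unchanged.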
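
The invariance of helicity under T can also be seen numerically. The sketch below constructs helicity eigenspinors for an arbitrary direction, forms the transformed 2-spinor $\sigma_2\phi_\lambda^*$, and checks that it is an eigenstate of $\boldsymbol{\sigma}\cdot\boldsymbol{p}'/|\boldsymbol{p}'|$ with the same eigenvalue when $\boldsymbol{p}' = -\boldsymbol{p}$. The explicit parametrization of the eigenspinors by polar angles is a standard one but is not given in the text, and the function names are hypothetical.

```python
import cmath
import math

# Pauli matrices
S1 = [[0, 1], [1, 0]]
S2 = [[0, -1j], [1j, 0]]
S3 = [[1, 0], [0, -1]]


def matvec(M, v):
    """Multiply a square matrix (nested lists) into a vector (list)."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]


def sigma_dot_n(n, v):
    """Apply sigma.n to the 2-spinor v, for a unit vector n."""
    out = [0j, 0j]
    for comp, S in zip(n, (S1, S2, S3)):
        out = [o + comp * x for o, x in zip(out, matvec(S, v))]
    return out


def helicity_spinor(theta, azimuth, lam):
    """Eigenspinor of sigma.n with eigenvalue lam = +1 or -1."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    if lam == 1:
        return [complex(c), cmath.exp(1j * azimuth) * s]
    return [-cmath.exp(-1j * azimuth) * s, complex(c)]


theta, azimuth = 0.7, 1.9                       # arbitrary sample direction
n = [math.sin(theta) * math.cos(azimuth),
     math.sin(theta) * math.sin(azimuth),
     math.cos(theta)]
n_reversed = [-c for c in n]                    # direction of p' = -p

errors = []
for lam in (1, -1):
    phi = helicity_spinor(theta, azimuth, lam)
    phi_t = matvec(S2, [c.conjugate() for c in phi])    # sigma_2 phi*
    # phi is a helicity eigenstate along n ...
    errors.append(max(abs(a - lam * b)
                      for a, b in zip(sigma_dot_n(n, phi), phi)))
    # ... and sigma_2 phi* has the same helicity along -n.
    errors.append(max(abs(a - lam * b)
                      for a, b in zip(sigma_dot_n(n_reversed, phi_t), phi_t)))
print(max(errors))
```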
As in the case of parity, we may introduce an operator $\hat{T}$ which changes
$\phi$ to $\phi_T$ for the KG equation, and $\psi$ to $\psi_T$ for the Dirac equation. Then


$$\hat{T}(\mathrm{KG}) = KT_0, \qquad (4.123)$$


and
$$\hat{T}(\mathrm{Dirac}) = U_TKT_0 \qquad (4.124)$$


where $K$ is the complex conjugation operator, and $T_0$ is the time coordinate
reversal operator. The appearance of $K$ is a general feature of time-reversal
in quantum mechanics (Wigner 1964), and has important consequences.³


³Complex conjugation also appeared in our discussion of C in section 4.2.2, but as
indicated there the true operator $\hat{C}$ of quantum field theory is unitary. Even in quantum field
theory, however, the time-reversal operator involves complex conjugation, as we shall see in
section 7.5.3.

Because the transformations involve complex conjugation, the scalar product of
two wavefunctions $\langle\psi_2|\psi_1\rangle$ is not equal to the corresponding quantity
$\langle\psi_{2T}|\psi_{1T}\rangle$, as it would be in the case of parity, for example, or for any
other transformation represented by a unitary operator. Instead, we have


$$\langle\psi_2|\psi_1\rangle = \langle\psi_{2T}|\psi_{1T}\rangle^*. \qquad (4.125)$$


Note, however, that the probability $|\langle\psi_2|\psi_1\rangle|^2$ is still preserved.
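
A small numerical illustration of (4.125), for the spinor part of the wavefunctions only: the sketch below applies $\psi \to U_T\psi^*$ with $U_T = \mathrm{diag}(\sigma_2, \sigma_2)$ (the standard-representation form, assumed here) to two arbitrary sample 4-spinors and compares the scalar products before and after. The names are illustrative.

```python
S2 = [[0, -1j], [1j, 0]]   # Pauli matrix sigma_2


def matvec(M, v):
    """Multiply a square matrix (nested lists) into a vector (list)."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]


def inner(a, b):
    """Scalar product a^dagger b."""
    return sum(complex(x).conjugate() * y for x, y in zip(a, b))


def time_reverse(psi):
    """psi_T = U_T psi*, with U_T = diag(sigma_2, sigma_2) (assumed form)."""
    c = [complex(x).conjugate() for x in psi]
    return matvec(S2, c[:2]) + matvec(S2, c[2:])


psi1 = [1.0, 1j, 0.5, -0.2j]        # arbitrary sample spinors
psi2 = [0.3j, 1.0, -1j, 0.4]

before = inner(psi2, psi1)
after = inner(time_reverse(psi2), time_reverse(psi1))
print(before, after)                # complex conjugates of each other
print(abs(before) ** 2, abs(after) ** 2)
```

The scalar product is complex-conjugated rather than preserved, while its squared modulus is unchanged.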
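If we consider the matrix element of any operator $\hat{O}$, then since $\hat{O}\psi_1$ is
itself a wavefunction, we must have


$$\langle\psi_2|\hat{O}|\psi_1\rangle = \langle\psi_2|\hat{O}\psi_1\rangle = \langle\psi_{2T}|\hat{T}\hat{O}\psi_1\rangle^* = \langle\psi_{2T}|\hat{T}\hat{O}\hat{T}^{-1}|\psi_{1T}\rangle^* \qquad (4.126)$$
where $\hat{T}\hat{O}\hat{T}^{-1}$ is the operator in the time-reversed system. In particular, if
we take $\hat{O}$ to be a Hermitian interaction potential $V$, which is time-reversal
invariant, then time-reversal invariance implies the relation


$$\langle\psi_2|V|\psi_1\rangle = \langle\psi_{2T}|V|\psi_{1T}\rangle^* = \langle\psi_{1T}|V|\psi_{2T}\rangle. \qquad (4.127)$$


Now $\langle\psi_2|V|\psi_1\rangle$ is the amplitude for the state represented by $\psi_1$ to make a
transition to the state represented by $\psi_2$ to first order in the potential $V$ (see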
section M.3 of appendix M). Equation (4.127) therefore relates this amplitude 
to one for the inverse transition, involving time-reversed states. The relation in 
fact holds for the complete (all orders) transition operator T (see for example 
Lee 1981, section 13.5), and enables one to relate rates and cross sections for 
reactions and their inverses. 

For strong interactions, these relations are straightforward to test, and 
confirm that strong interactions are T-invariant. So are electromagnetic inter- 
actions. In weak interactions, where the violation of CP and the conservation 
of CPT implies that T is violated, it is generally very difficult if not impos- 
sible to set up the conditions for an inverse reaction to occur (consider the 
inverse of neutron decay, $n \to p\,e^-\bar\nu_e$, for example). However, one such test is
possible in neutral K-decays (Kabir 1970). We can check whether the rate for
a particle tagged at its production as a $\bar{K}^0$ to decay in a way that identifies
it as a $K^0$ is equal to the rate for a particle tagged as $K^0$ at its production
to decay in a way that identifies it as a $\bar{K}^0$. The experiment (Angelopoulos
et al. 1998) showed a T-violating difference in these rates. The parame- 
ters determining these reactions had actually been well determined by other 
measurements; still, this was an independent and direct demonstration of T 
violation. Evidence for T violation in B-meson transitions has been reported 
by Alvarez and Szynkman (2008), developing a test suggested by Banuls and 
Bernabeu (1999, 2000). 

We can also examine the behaviour of various bilinears under T. For ex- 
ample, the reader may easily check the results 


$$\bar\psi_T(x')\psi_T(x') = \bar\psi(x)\psi(x), \qquad \bar\psi_T(x')\gamma_5\psi_T(x') = -\bar\psi(x)\gamma_5\psi(x). \qquad (4.128)$$


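Time reversal symmetry will be violated if the theory contains both even and
odd amplitudes under T. An interesting example is provided by the amplitude


$$\mathrm{i}d_e\bar\psi(x)\sigma^{\mu\nu}\gamma_5\psi(x)F_{\mu\nu}, \qquad (4.129)$$
where
$$\sigma^{\mu\nu} = \tfrac{\mathrm{i}}{2}(\gamma^\mu\gamma^\nu - \gamma^\nu\gamma^\mu) \qquad (4.130)$$


and where $F_{\mu\nu}$ is an external electric field with non-vanishing components
$F_{0i} = E^i$. In the representation (3.31),


$$\sigma^{0i} = \mathrm{i}\begin{pmatrix} 0 & \sigma_i \\ \sigma_i & 0 \end{pmatrix} = \mathrm{i}\alpha_i, \qquad (4.131)$$
and (4.129) reduces, up to a numerical factor, to
$$d_e\bar\psi(x)\boldsymbol{\Sigma}\psi(x)\cdot\boldsymbol{E}. \qquad (4.132)$$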


Problem 4.5 shows that the quantity (4.132) is odd under T, and it is easy 
to check that it is also odd under P. A non-zero value of such a term would 
correspond to an electric dipole moment for a spin-1/2 particle (compare the 
analogous quantity $d_m\bar\psi(x)\boldsymbol{\Sigma}\psi(x)\cdot\boldsymbol{B}$ for the magnetic dipole moment, which
is even under P and T). Experiment places very strong limits on possible
electric dipole moments (Nakamura et al. 2010) for the neutron, proton and
electron:


$$d_n < 0.29\times 10^{-25}\ e\,\mathrm{cm} \qquad (4.133)$$
$$d_p < 0.54\times 10^{-23}\ e\,\mathrm{cm} \qquad (4.134)$$
$$d_e = (0.069 \pm 0.074)\times 10^{-26}\ e\,\mathrm{cm} \qquad (4.135)$$


Although these numbers seem tiny, calculations of $d_n$ in the Standard
Model produce a result some 6 or 7 orders of magnitude smaller than (4.133). 
However, these experimental limits impose strong constraints on theories 
which go beyond the Standard Model, and which may typically contain the 
possibility of larger T and CP violating effects. 


4.2.5 CPT 


We denote the product CPT by $\theta$, and the corresponding operator by $\hat\theta$. As
already mentioned, for any conventional quantum field theory, and certainly
for the Standard Model, the transformation $\theta$ is an invariance of the theory.
One immediate consequence of this invariance is the equality of particle and
antiparticle masses. This is easily demonstrated. Let $|X, s_z\rangle$ be the state of
a particle X at rest with z-component of spin equal to $s_z$. The mass of X is
given by the expectation value


$$M_X = \langle X, s_z|H|X, s_z\rangle, \qquad (4.136)$$


where $H$ is the total Hamiltonian. Clearly $M_X$ is real, and independent of
$s_z$. Now the operator $\hat\theta$ involves $\hat{T}$, and therefore we must be careful to use
(4.126) rather than the usual rule for unitary operators. So from (4.126) we
have


$$M_X = \langle X, s_z|H|X, s_z\rangle^* = \langle X, s_z|\hat\theta^{-1}\,\hat\theta H\hat\theta^{-1}\,\hat\theta|X, s_z\rangle. \qquad (4.137)$$


If the Hamiltonian is CPT invariant, then $\hat\theta H\hat\theta^{-1} = H$. Also, we know
the action of P, C and T on the states, from the previous results. Equation
(4.137) then becomes


$$M_X = \langle\bar{X}, -s_z|H|\bar{X}, -s_z\rangle = M_{\bar{X}}, \qquad (4.138)$$


stating the equality of particle and antiparticle masses. The most sensitive
test of (4.138) is provided by the $K^0$-$\bar{K}^0$ system, where the currently quoted
limit for the mass difference is (Nakamura et al. 2010)
$$\frac{|m_{K^0} - m_{\bar{K}^0}|}{m_{\mathrm{average}}} < 8\times 10^{-19} \quad \text{at 90\% C.L.} \qquad (4.139)$$
$\theta$-invariance also implies that the charges of a charged particle and its
antiparticle are equal in magnitude but opposite in sign, as are their magnetic
moments; and in the case of unstable particles it implies that their lifetimes
are equal, to first order in the interaction responsible for the decay (Lee 1981).
All current data support these equalities (Nakamura et al. 2010). Other tests
involve analysis of the implications of $\theta$-invariance as applied to transition
amplitudes. As an example, we refer to a recent analysis of K-decays by
Abouzaid et al. (2011), both with and without the assumption of $\theta$-invariance.
The results were consistent with $\theta$-invariance.




Problems 


4.1 Consider an infinitesimal boost along the x-axis,


$$t' = t - \eta x \qquad (4.140)$$
$$x' = x - \eta t. \qquad (4.141)$$
Show that the KG wavefunction transforms according to

$$\phi'(x, t) = (1 + \mathrm{i}\eta\hat{K}_x)\phi(x, t), \qquad (4.142)$$


where


$$\hat{K}_x = -\mathrm{i}x\,\partial/\partial t - \mathrm{i}t\,\partial/\partial x. \qquad (4.143)$$

Defining similar operators $\hat{K}_y$, $\hat{K}_z$ for boosts in the y and z directions, show
that

$$[\hat{K}_x, \hat{K}_y] = -\mathrm{i}\hat{L}_z. \qquad (4.144)$$

4.2 In this problem, use the representation (3.40) for the Dirac matrices, as
in section 4.1.2.


(a) Using the rule (4.19) for the transformation of the spinor $\phi$ under
an infinitesimal rotation of the coordinate system, verify that $\phi^\dagger\boldsymbol{\sigma}\phi$
transforms as a 3-vector. [Hint: you need to show that $\phi'^\dagger\boldsymbol{\sigma}\phi' =
\phi^\dagger\boldsymbol{\sigma}\phi - \boldsymbol{\epsilon}\times\phi^\dagger\boldsymbol{\sigma}\phi$; use the results of problem 3.4(a).] Show also that
the free-particle Dirac probability current density is a 3-vector.

(b) Using the rule (4.42) for the transformation of $\phi$ and $\chi$ under an
infinitesimal boost, verify that $\boldsymbol{j} = \phi^\dagger\boldsymbol{\sigma}\phi - \chi^\dagger\boldsymbol{\sigma}\chi$ transforms as the
3-vector part of the 4-vector $(\rho, \boldsymbol{j})$. [Hint: you need to show that
$\boldsymbol{j}' = \boldsymbol{j} - \boldsymbol{\eta}\rho$.]


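4.3 (a) Defining the four ‘$\gamma$ matrices’


$$\gamma^\mu = (\gamma^0, \boldsymbol{\gamma})$$


where $\gamma^0 = \beta$ and $\boldsymbol{\gamma} = \beta\boldsymbol{\alpha}$, show that the Dirac equation can
be written in the form $(\mathrm{i}\gamma^\mu\partial_\mu - m)\psi = 0$. Find the anticommutation
relations of the $\gamma$ matrices. Show that the positive energy
spinors $u(p, s)$ satisfy $(\not{p} - m)u(p, s) = 0$, and that the negative
energy spinors $v(p, s)$ satisfy $(\not{p} + m)v(p, s) = 0$, where $\not{p} = \gamma^\mu p_\mu$
(pronounced ‘p-slash’).
(b) Define the conjugate spinor
$$\bar\psi(x) = \psi^\dagger(x)\gamma^0$$

and use the previous result to find the equation satisfied by $\bar\psi$ in $\gamma$
matrix notation.


(c) The Dirac probability current may be written as
$$j^\mu = \bar\psi(x)\gamma^\mu\psi(x).$$
Show that it satisfies the conservation law


$$\partial_\mu j^\mu = 0.$$


4.4 (a) Verify that, under P, $\bar\psi(\boldsymbol{x}, t)\gamma^0\psi(\boldsymbol{x}, t)$ is a scalar, and that $\bar\psi(\boldsymbol{x}, t)\boldsymbol{\gamma}\psi(\boldsymbol{x}, t)$
is a polar vector.


(b) Verify that $a^\mu(\boldsymbol{x}, t) = \bar\psi(\boldsymbol{x}, t)\gamma_5\gamma^\mu\psi(\boldsymbol{x}, t)$ transforms under infinitesimal
rotations and boosts as a 4-vector; and that under P $a^0(\boldsymbol{x}, t)$ is
a pseudoscalar, and $\boldsymbol{a}(\boldsymbol{x}, t)$ is an axial vector.

(c) Show that $\sigma_2\phi^*$ transforms under rotations and boosts as a $\chi$-type
spinor, and that $\sigma_2\chi^*$ transforms as a $\phi$-type spinor.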


4.5 Verify that $\bar\psi(\boldsymbol{x}, t)\boldsymbol{\Sigma}\psi(\boldsymbol{x}, t)\cdot\boldsymbol{E}$ of (4.132) is odd under T.

4.6 The Galilean transformation (non-relativistic boost) is defined by
$$\boldsymbol{x}' = \boldsymbol{x} - \boldsymbol{v}t, \qquad t' = t.$$


Show that the free-particle time-dependent Schrödinger equation is covariant
under this transformation if the wavefunction transforms according to the rule
$\psi'(\boldsymbol{x}', t') = \exp[\mathrm{i}f(\boldsymbol{x}, t)]\psi(\boldsymbol{x}, t)$, where $f(\boldsymbol{x}, t)$ satisfies the condition

$$-\frac{\partial f}{\partial t} - \boldsymbol{v}\cdot\nabla f + \mathrm{i}\boldsymbol{v}\cdot\nabla = \frac{1}{2m}(\nabla f)^2 - \frac{\mathrm{i}}{2m}\nabla^2 f - \frac{\mathrm{i}}{m}\nabla f\cdot\nabla.$$

Find constants $a$ and $\boldsymbol{b}$ such that the function $f = at + \boldsymbol{b}\cdot\boldsymbol{x}$ satisfies this
condition. Show that the resulting transformation rule is consistent with the
way you expect a plane wave solution to transform.


Introduction to Quantum

It was a wonderful world my father told me about.

You might wonder what he got out of it all. I went to MIT. I went to 
Princeton. I went home and he said, ‘Now you’ve got a science education. I 
have always wanted to know something that I have never understood; and so, 
my son, I want you to explain it to me.’ I said yes. 

He said, ‘I understand that they say that light is emitted from an atom 
when it goes from one state to another, from an excited state to a state of 
lower energy.’ 

I said ‘That’s right.’ 

‘And light is a kind of particle, a photon I think they call it.’ 

‘Yes.’ 

‘So if the photon comes out of the atom when it goes from the excited to 
the lower state, the photon must have been in the atom in the excited state.’ 

I said, ‘Well, no.’ 

He said, ‘Well, how do you look at it so you can think of a particle photon 
coming out without it having been in there in the excited state?’ 

I thought a few minutes, and I said, ‘I’m sorry; I don’t know. I can't 
explain it to you.’ 

He was very disappointed after all these years and years trying to teach 
me something, that it came out with such poor results. 


—R. P. Feynman, The Physics Teacher, vol 7, No 6, September 1969 


All the fifty years of conscious brooding have brought me no closer to the 
answer to the question, ‘What are light quanta?’ Of course today every rascal 
thinks he knows the answer, but he is deluding himself. 


—A. Einstein (1951) 


Quoted in ‘Einstein’s research on the nature of light’ 
E. Wolf (1979), Optics News, vol 5, No 1, page 39.


I never satisfy myself until I can make a mechanical model of a thing. If I can 
make a mechanical model I can understand it. As long as I cannot make a 
mechanical model all the way through I cannot understand; and that is why 
I cannot get the electromagnetic theory. 


—Sir William Thomson, Lord Kelvin, 1884 Notes of Lectures on Molecular 
Dynamics and the Wave Theory of Light delivered at the Johns Hopkins Uni- 
versity, Baltimore, stenographic report by A. S. Hathaway (Baltimore: Johns 
Hopkins University) Lecture XX, pp 270-1.

Quantum Field Theory I: The Free Scalar Field


In this chapter we shall give an elementary introduction to quantum field 
theory, which is the established ‘language’ of the Standard Model of particle 
physics. Even so long after Maxwell’s theory of the (classical) electromagnetic 
field, the concept of a ‘disembodied’ field is not an easy one; and we are 
going to have to add the complications of quantum mechanics to it. In such a 
situation, it is helpful to have some physical model in mind. For most of us, as 
for Lord Kelvin, this still means a mechanical model. Thus in the following two 
sections we begin by considering a mechanical model for a quantum field. At 
the end, we shall — like Maxwell — throw away the ‘mechanism’ and have simply 
quantum field theory. Section 5.1 describes this programme qualitatively; 
section 5.2 presents a more complete formalism, for the simple case of a field 
whose quanta are massless, and move in only one spatial dimension. The 
appropriate generalizations for massive quanta in three dimensions are given 
in section 5.3. 




5.1 The quantum field: (i) descriptive 


Mechanical systems are usefully characterized by the number of degrees of 
freedom they possess: thus a one-dimensional pendulum has one degree of 
freedom, two coupled one-dimensional pendulums have two degrees of free- 
dom — which may be taken to be their angular displacements, for example. A 
scalar field $\phi(x, t)$ corresponds to a system with an infinite number of degrees
of freedom, since at each continuously varying point $x$ an independent
‘displacement’ $\phi(x, t)$, which also varies with time, has to be determined. Thus
quantum field theory involves two major mathematical steps: the description 
of continuous systems (fields) which have infinitely many degrees of freedom, 
and the application of quantum theory to such systems. These two aspects are 
clearly separable. It is certainly easier to begin by considering systems with 
a discrete — but possibly very large — number of degrees of freedom, for ex- 
ample a solid. We shall treat such systems first classically and then quantum 
mechanically. Then, returning to the classical case, we shall allow the number
of degrees of freedom to become infinite, so that the system corresponds to a
classical field. Finally, we shall apply quantum mechanics directly to fields.

FIGURE 5.1
A vibrating system with two degrees of freedom: (a) two mass points at rest,
with the strings under tension; (b) a small transverse displacement.

We begin by considering a rather small solid — one that has only two atoms 
free to move. The atoms, each of mass m, are connected by a string, and each 
is connected to a fixed support by a similar string (figure 5.1(a)); all the 
strings are under tension F. We consider small transverse vibrations of the 
atoms (figure 5.1(b)), and we call $q_r(t)$ ($r = 1, 2$) the transverse displacements.
We are interested in the total energy $E$ of the system. According to classical
mechanics, this is equal to the sum of the kinetic energies $\tfrac{1}{2}m\dot{q}_r^2$ of each
atom, together with a potential energy $V$ which can be calculated as follows.
Referring to figure 5.1(b), when atom 1 is displaced by $q_1$, it experiences a
restoring force
$$F_1 = F\sin\alpha - F\sin\beta \qquad (5.1)$$


assuming a constant tension $F$ along the string. For small displacements $q_1$
and $q_2$ (i.e. $q_{1,2} \ll l$) we have


$$\sin\alpha = q_1/(l^2 + q_1^2)^{1/2} \approx q_1/l$$
$$\sin\beta = (q_2 - q_1)/[l^2 + (q_2 - q_1)^2]^{1/2} \approx (q_2 - q_1)/l \qquad (5.2)$$


where terms of order $(q_{1,2}/l)^2$ and higher have been neglected. Thus the
restoring force on particle 1 is, in this approximation,


$$F_1 = k(2q_1 - q_2) \qquad (5.3)$$
with $k = F/l$. Similarly, the restoring force on particle 2 is
$$F_2 = k(2q_2 - q_1) \qquad (5.4)$$


and the equations of motion are


$$m\ddot{q}_1 = -k(2q_1 - q_2) \qquad (5.5)$$
$$m\ddot{q}_2 = -k(2q_2 - q_1). \qquad (5.6)$$

The potential energy is then determined (up to an irrelevant constant) by the
requirement that (5.5) and (5.6) are of the form


$$m\ddot{q}_1 = -\partial V/\partial q_1 \qquad (5.7)$$
$$m\ddot{q}_2 = -\partial V/\partial q_2. \qquad (5.8)$$

Thus we deduce that
$$V = k(q_1^2 + q_2^2 - q_1q_2). \qquad (5.9)$$


Equations (5.5) and (5.6) form a pair of linear, coupled differential equations.
Each of the words ‘linear’ and ‘coupled’ is important. By ‘linear’ is meant that only
the first power of $q_1$ and $q_2$ and their time derivatives appear in the equations
of motion; terms such as $q_1^2$, $q_1q_2$, $\dot{q}_1^2$, $q_1^3$ and so on would render the equations
of motion ‘nonlinear’. This linear/nonlinear distinction is a crucial one
in dynamics. Most importantly, the solutions of linear differential equations 
may be added together with constant coefficients (‘linearly superposed’) to 
make new valid solutions of the equations. In contrast, solutions of nonlinear 
differential equations — besides being very hard to find! — cannot be linearly 
superposed to get new solutions. In addition, nonlinear dynamical equations 
may typically lead to chaotic motion. 

The notion of linearity/nonlinearity carries over also into the equations of
motion for fields. In this context, an equation for a field $\phi(x, t)$ is said to be
linear if $\phi$ and its space or time derivatives appear only to the first power.
As we shall see, this is true for Maxwell’s equations for the electromagnetic 
field and it is, of course, the mathematical reason behind all the physics of such 
things as interference and diffraction, which may be understood precisely in 
terms of superposition of solutions of these equations. Likewise the equations 
of quantum mechanics (e.g. Schrödinger’s equation) are all linear in this sense,
consistent with the principle of superposition in quantum mechanics. 

It is clear, then, that in looking at simple mechanical models as a guide 
to the field systems in which we will ultimately be interested, we should con- 
sider ones in which the equations of motion are linear. In the present case, 
this is true, but only because we have made the approximation that $q_1$ and
$q_2$ are small (compared to $l$). Referring to equation (5.2), we can immediately
see that if we had kept the full expression for $\sin\alpha$ and $\sin\beta$, the
resulting equations of motion would have been highly nonlinear. A similar
‘small displacement’ approximation has to be made in determining the famil- 
iar wave equation, describing waves on continuous strings, for example (see 
(5.29) later). Most significantly, however, quantum mechanics is believed to 
be a linear theory without any approximation. 

The appearance of only linear terms in $q_1$ and $q_2$ in the equations of motion
implies, via (5.7) and (5.8), that the potential energy can only involve
quadratic powers of the $q$’s, i.e. $q_1^2$, $q_2^2$ and $q_1q_2$, as in (5.9). Once again, had
we used the general expression for the potential energy in a stretched string
as ‘tension × extension’ we would have obtained an expression containing all
powers of the $q$’s via such terms as $([l^2 + q_1^2]^{1/2} - l)$.

We turn now to the coupled aspect of (5.5) and (5.6). By this we mean
that the right-hand side of the $q_1$ equation depends on $q_2$ as well as $q_1$, and
similarly for the $q_2$ equation. This ‘mathematical’ coupling has its origin in
the term $-kq_1q_2$ in $V$, which corresponds to the ‘physical’ coupling of the
string BC connecting the two atoms. If this coupling were absent, equations
(5.5) and (5.6) would describe two independent (uncoupled) harmonic
oscillators, each of frequency $(2k/m)^{1/2}$. When we consider the addition of
more and more particles (see later) we certainly do not want them to vibrate 
independently, otherwise we would not be able to get wave-like displacements 
propagating through the system. So we need to retain at least this minimal 
kind of ‘quadratic’ coupling. 

With the coupling, the solutions of (5.5) and (5.6) are not quite so obvious. 
However, a simple step makes the equations much easier. Suppose we add the 
two equations so as to obtain 


$$m(\ddot{q}_1 + \ddot{q}_2) = -k(q_1 + q_2) \qquad (5.10)$$
and subtract them to obtain
$$m(\ddot{q}_1 - \ddot{q}_2) = -3k(q_1 - q_2). \qquad (5.11)$$


A remarkable thing has happened: the two combinations $q_1 + q_2$ and $q_1 - q_2$
of the original coordinates satisfy uncoupled equations, which are of course
very easy to solve. The combination $q_1 + q_2$ oscillates with frequency $\omega_1 =
(k/m)^{1/2}$, while $q_1 - q_2$ oscillates with frequency $\omega_2 = (3k/m)^{1/2}$.
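
The two frequencies can be confirmed directly from the equations of motion (5.5) and (5.6): a displacement pattern is a single-frequency motion if the acceleration of each atom is $-\omega^2$ times its displacement. The short sketch below tests the patterns $q_1 = q_2$ and $q_1 = -q_2$; the function names and the sample values of $k$ and $m$ are illustrative choices.

```python
import math


def accelerations(q1, q2, k, m):
    """Accelerations of the two atoms from (5.5) and (5.6)."""
    return -k * (2 * q1 - q2) / m, -k * (2 * q2 - q1) / m


def mode_frequency(shape, k, m):
    """Return omega if `shape` = (q1, q2) oscillates with a single frequency,
    i.e. if each acceleration equals -omega^2 times the displacement."""
    q1, q2 = shape
    a1, a2 = accelerations(q1, q2, k, m)
    omega_sq = -a1 / q1
    if abs(a2 + omega_sq * q2) > 1e-12:
        raise ValueError("this displacement pattern is not a single-frequency motion")
    return math.sqrt(omega_sq)


k, m = 2.0, 0.5                                 # arbitrary sample values
omega_1 = mode_frequency((1.0, 1.0), k, m)      # atoms moving together
omega_2 = mode_frequency((1.0, -1.0), k, m)     # atoms moving oppositely
print(omega_1, math.sqrt(k / m))
print(omega_2, math.sqrt(3 * k / m))
```

A pattern such as $(1, 0)$, in which only one atom is displaced, makes `mode_frequency` raise an error, reflecting the fact that the individual coordinates $q_1$ and $q_2$ do not oscillate with a single frequency.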

Let us introduce 


Qı = (a + q2)/V2 Q2 = (qı — q2)/v2 (5.12) 


(the /2's are for later convenience). Then the solutions of (5.10) and (5.11) 
are: 


Qi(t) = Acoswit+ Bsinwyt (5.13) 
Q(t) = Ccoswat+ Dsinwot. (5.14) 


Suppose that the initial conditions are such that 
$$ q_1(0) = q_2(0) = a \qquad \dot{q}_1(0) = \dot{q}_2(0) = 0 \tag{5.15} $$

i.e. the atoms are released from rest, at equal transverse displacements a. In
terms of the $Q_r$’s, the conditions (5.15) are

$$ Q_2(0) = \dot{Q}_2(0) = 0 \qquad Q_1(0) = \sqrt{2}\,a \qquad \dot{Q}_1(0) = 0. \tag{5.16} $$

Thus from (5.13) and (5.14) we find that the complete solution, for these
initial conditions, is

$$ Q_1(t) = \sqrt{2}\,a\cos\omega_1 t \tag{5.17} $$
$$ Q_2(t) = 0. \tag{5.18} $$
FIGURE 5.2
Motion in the two normal modes: (a) frequency $\omega_1$; (b) frequency $\omega_2$.


We see from (5.18) that the motion is such that $q_1 = q_2$ throughout, and from
(5.17) that the system vibrates with a single definite frequency $\omega_1$. A form
of motion in which the system as a whole moves with a definite frequency
is called a ‘normal mode’ or simply a ‘mode’ for short. Figure 5.2(a) shows
two ‘snapshot’ configurations of our two-atom system when it is oscillating in
the mode characterized by $q_1 = q_2$. In this mode, only $Q_1(t)$ changes; $Q_2(t)$
is always zero. Another mode also exists in which $q_1 = -q_2$ at all times:
here $Q_1(t)$ is zero and $Q_2(t)$ oscillates with frequency $\omega_2$. Figure 5.2(b) shows
two snapshots of the atoms when they are vibrating in this second mode.
The coordinate combinations $Q_1$, $Q_2$, in terms of which this ‘single frequency
motion’ occurs, are called ‘normal mode coordinates’ or ‘normal coordinates’
for short.

In general, the initial conditions will not be such that the motion is a pure
mode; both $Q_1(t)$ and $Q_2(t)$ will be non-zero. From (5.12) we have

$$ q_1(t) = [Q_1(t) + Q_2(t)]/\sqrt{2} \tag{5.19} $$

and

$$ q_2(t) = [Q_1(t) - Q_2(t)]/\sqrt{2} \tag{5.20} $$

so that $q_1$ and $q_2$ are expressed as a sum of two terms oscillating with
frequencies $\omega_1$ and $\omega_2$. We say the system is in ‘a superposition of modes’.
Nevertheless, the mode idea is still very important as regards the total energy of
the system, as we shall now see. The kinetic energy can be written in terms
of the mode coordinates $Q_r$ as


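$$ T = \tfrac{1}{2}m\dot{Q}_1^2 + \tfrac{1}{2}m\dot{Q}_2^2 \tag{5.21} $$
while the potential energy V of (5.9) becomes
$$ V = \tfrac{1}{2}m\omega_1^2Q_1^2 + \tfrac{1}{2}m\omega_2^2Q_2^2 = V(Q_1, Q_2). \tag{5.22} $$
The total energy is therefore

$$ E = [\tfrac{1}{2}m\dot{Q}_1^2 + \tfrac{1}{2}m\dot{Q}_2^2] + [\tfrac{1}{2}m\omega_1^2Q_1^2 + \tfrac{1}{2}m\omega_2^2Q_2^2]. \tag{5.23} $$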
This equation shows that, when written in terms of the normal coordinates,
the total energy contains no coupling terms of the form $Q_1Q_2$; indeed, the
energy has the remarkable form of a simple sum of two independent uncoupled
oscillators, one with characteristic frequency $\omega_1$, the other with frequency $\omega_2$.
The energy (5.23) has exactly the form appropriate to a system of two
non-interacting “things”, each executing simple harmonic motion: the “things” are
actually the two modes. Modes do not interact, whereas the original atoms do!
Of course, this decoupling in the expression for the total energy is reflected in
the decoupling of the equations of motion for the Q variables:

$$ \ddot{Q}_r + \omega_r^2Q_r = 0 \qquad r = 1, 2. \tag{5.24} $$
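The equality of the two forms of the energy is easy to confirm numerically. The sketch below assumes that the potential (5.9) is $V = k(q_1^2 + q_2^2 - q_1q_2)$, which is consistent with the $-kq_1q_2$ coupling term and with (5.22); the function names and sample values are our own.

```python
import math

def energy_atoms(q1, q2, v1, v2, k, m):
    """Total energy in atomic coordinates, V = k(q1^2 + q2^2 - q1*q2) assumed."""
    return 0.5 * m * (v1**2 + v2**2) + k * (q1**2 + q2**2 - q1 * q2)

def energy_modes(q1, q2, v1, v2, k, m):
    """The same energy written as a sum of two mode energies, as in (5.23)."""
    s = math.sqrt(2.0)
    Q1, Q2 = (q1 + q2) / s, (q1 - q2) / s
    Q1dot, Q2dot = (v1 + v2) / s, (v1 - v2) / s
    w1_sq, w2_sq = k / m, 3.0 * k / m
    e1 = 0.5 * m * Q1dot**2 + 0.5 * m * w1_sq * Q1**2
    e2 = 0.5 * m * Q2dot**2 + 0.5 * m * w2_sq * Q2**2
    return e1 + e2

sample = (0.3, -0.1, 0.2, 0.5, 2.0, 0.5)   # q1, q2, v1, v2, k, m
e_atoms = energy_atoms(*sample)
e_modes = energy_modes(*sample)
```

The two numbers agree for arbitrary displacements and velocities, with no cross term between the modes.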


It is most important to realize that the modes are non-interacting by virtue 
of the fact that we ignored higher than quadratic terms in $V(q_1, q_2)$. Although
the simple change of variables $(q_1, q_2) \to (Q_1, Q_2)$ of (5.12) does remove the
$q_1q_2$ coupling, this would not be the case if, say, cubic terms in V were to
be considered. Such higher order ‘anharmonic’ corrections would produce 
couplings between the modes — indeed, this will be the basis of the quantum 
field theory description of particle interactions (see the following chapter)! 

The system under discussion had just two degrees of freedom. We began
by describing it in terms of the obvious degrees of freedom, the physical
displacements of the two atoms $q_1$ and $q_2$. But we have learned that it is very
illuminating to describe it in terms of the normal coordinate combinations
$Q_1$ and $Q_2$. The normal coordinates are really the relevant degrees of freedom.
Of course, for just two particles, the choice between the $q_r$’s and the
$Q_r$’s may seem rather academic; but the important point (and the reason
for going through these simple manipulations in detail) is that the basic idea
of the normal mode, and of normal coordinates, generalizes immediately to
the much less trivial N-atom problem (and also to the field problem). For N
atoms there are (for one-dimensional displacements) N degrees of freedom,
and if we take them to be the actual atomic displacements, the total energy
will be

$$ E = \sum_{r=1}^{N}\tfrac{1}{2}m\dot{q}_r^2 + V(q_1, \ldots, q_N) \tag{5.25} $$


which includes all the couplings between atoms. We assume, as before, that
the $q_r$’s are small enough so that only quadratic terms need to be kept in V (a
constant is as usual irrelevant, and the linear terms vanish if the $q_r$’s are the
displacements from equilibrium). In this case, the equations of motion will be
linear. By a linear transformation of the form (generalizing (5.12))

$$ Q_r = \sum_{s=1}^{N}a_{rs}q_s \tag{5.26} $$
it is possible to write E as a sum of N separate terms, just as in (5.23):

$$ E = \sum_{r=1}^{N}[\tfrac{1}{2}m\dot{Q}_r^2 + \tfrac{1}{2}m\omega_r^2Q_r^2]. \tag{5.27} $$


The $Q_r$’s are the normal coordinates and the $\omega_r$’s are the normal frequencies,
and there are N of them. If only one of the $Q_r$’s is non-zero, the N atoms are
moving in a single mode. The fact that the total energy in (5.27) is a sum of 
N single-mode energies allows us to say that our N-atom solid behaves as if 
it consisted of N separate and free harmonic oscillators — which, however, are 
not to be identified with the coordinates of the original atoms. Once again, 
and now much more crucially, it is the mode coordinates that are the relevant 
degrees of freedom rather than those of the original particles. 
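As a concrete illustration of (5.26) and (5.27), consider one particular N-atom model: a chain with fixed ends in which each atom is coupled only to its nearest neighbours, so that the force on atom s is $-k(2q_s - q_{s-1} - q_{s+1})$. This specific model is an assumption made for the example (the text above allows any quadratic V); for it the normal modes are sine-shaped, and the sketch below checks this directly. All names are illustrative.

```python
import math

def chain_mode(N, r):
    """Mode shape s -> sin(r*s*pi/(N+1)), s = 1..N, for the fixed-end chain."""
    return [math.sin(r * s * math.pi / (N + 1)) for s in range(1, N + 1)]

def chain_omega_sq(N, r, k, m):
    """Squared normal frequency of mode r for the nearest-neighbour chain."""
    return (k / m) * (2.0 - 2.0 * math.cos(r * math.pi / (N + 1)))

def apply_K(q, k):
    """(K q)_s = k(2 q_s - q_{s-1} - q_{s+1}), with q_0 = q_{N+1} = 0."""
    N = len(q)
    out = []
    for s in range(N):
        left = q[s - 1] if s > 0 else 0.0
        right = q[s + 1] if s < N - 1 else 0.0
        out.append(k * (2.0 * q[s] - left - right))
    return out

def max_residual(N, k, m):
    """Largest |K v - m w^2 v| over all N modes; zero if the modes are exact."""
    worst = 0.0
    for r in range(1, N + 1):
        v = chain_mode(N, r)
        Kv = apply_K(v, k)
        w_sq = chain_omega_sq(N, r, k, m)
        worst = max(worst, max(abs(a - m * w_sq * b) for a, b in zip(Kv, v)))
    return worst

residual = max_residual(8, 1.3, 0.7)
```

For N = 2 the same formula gives $\omega^2 = k/m$ and $3k/m$, the two frequencies found for the two-atom system.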

The second stage in our programme is to treat such systems quantum 
mechanically, as we should certainly have to for a real solid. It is still true 
that — if the potential energy is a quadratic function of the displacements — 
the transformation (5.26) allows us to write the total energy as a sum of N 
mode energies, each of which has the form of a harmonic oscillator. Now, 
however, these oscillators obey the laws of quantum mechanics, so that each 
mode oscillator exists only in certain definite states, whose energy eigenvalues 
are quantized. For each mode of frequency $\omega_r$, the allowed energy values are

$$ E_r = (n_r + \tfrac{1}{2})\hbar\omega_r \tag{5.28} $$

where $n_r$ is a positive integer or zero. This is in sharp contrast to the classical
case, of course, in which arbitrary values are allowed for the oscillator energies.
The total energy eigenvalue then has the form

$$ E = \sum_{r=1}^{N}(n_r + \tfrac{1}{2})\hbar\omega_r. \tag{5.29} $$


The frequencies $\omega_r$ are determined by the interatomic forces and are common
to both the classical and quantum descriptions; in quantum theory, though,
the states of definite energy of the vibrating N-body system are characterized by
the values of a set of integers $(n_1, n_2, \ldots, n_N)$, which determine the energies
of each mode oscillator.
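The labelling of states by integers can be made concrete with a few lines of code. This is only a sketch of formula (5.29); the function name, the choice of units with $\hbar = 1$ by default, and the sample frequencies are assumptions made for the illustration.

```python
def total_energy(occupations, omegas, hbar=1.0):
    """Energy eigenvalue (5.29) for occupation numbers (n_1, ..., n_N)."""
    if len(occupations) != len(omegas):
        raise ValueError("need one occupation number per mode")
    if any(n < 0 or n != int(n) for n in occupations):
        raise ValueError("occupation numbers must be non-negative integers")
    return sum((n + 0.5) * hbar * w for n, w in zip(occupations, omegas))

omegas = [1.0, 3.0 ** 0.5]                 # two-atom example with k/m = 1
ground = total_energy([0, 0], omegas)      # zero-point energy only
excited = total_energy([2, 1], omegas)     # two quanta in mode 1, one in mode 2
```

The difference between the two energies is $2\hbar\omega_1 + \hbar\omega_2$: each unit increase of an $n_r$ adds one quantum $\hbar\omega_r$.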

For each mode oscillator, $\hbar\omega_r$ measures the quantum of vibrational energy;
the energy of an allowed mode state is determined uniquely by the number $n_r$
of such quanta of energy in the state. We now make a profound reinterpretation
of this result (first given, almost en passant, by Born, Heisenberg and
Jordan (Born et al. 1926) in one of the earliest papers on quantum mechanics).
We forget about the original N degrees of freedom $q_1, q_2, \ldots, q_N$ and the
original N ‘atoms’, which indeed are only remembered in (5.29) via the fact
that there are N different mode frequencies $\omega_r$. Instead we concentrate on
the quanta and treat them as ‘things’ which really determine the behaviour
of our quantum system. We say that ‘in a state with energy $(n_r + \tfrac{1}{2})\hbar\omega_r$ there
are $n_r$ quanta present’. For the state characterized by $(n_1, n_2, \ldots, n_N)$ there
are $n_1$ quanta of mode 1 (frequency $\omega_1$), $n_2$ of mode 2, ... and $n_N$ of mode N.
Note particularly that although the number of modes N is fixed, the values of
the $n_r$’s are unrestricted, except insofar as the total energy is fixed. Thus we
are moving from a ‘fixed number’ picture (N degrees of freedom) to a ‘variable
number’ picture (the $n_r$’s restricted only by the total energy constraint
(5.29)). In the case of a real solid, these quanta of vibrational energy are
called phonons. We summarize the point we have reached by the important
statement that a phonon is an elementary quantum of vibrational excitation.

Now we take one step backward in order, afterwards, to take two steps
forward. We return to the classical mechanical model with N harmonically
interacting degrees of freedom. It is possible to imagine increasing the number
N to infinity, and decreasing the interatomic spacing a to zero, in such a
way that the product Na stays finite, say $Na = \ell$. We then have a classical
continuous system, for example a string of length $\ell$. (We stay in one dimension
for simplicity.) The transverse vibrations of this string are now described
by a field $\phi(x,t)$, where at each point x of the string $\phi(x,t)$ measures the
displacement from equilibrium, at the time t, of a small element of string around
the point x. Thus we have passed from a system described by a discrete number
of degrees of freedom, $q_r(t)$ or $Q_r(t)$, to one described by a continuous
degree of freedom, the displacement field $\phi(x,t)$. The discrete suffix r has
become the continuous argument x and, to prepare for later abstraction, we
have denoted the displacement by $\phi(x,t)$ rather than, say, q(x, t).

In the continuous problem the analogue of the small-displacement assumption,
which limited the potential energy in the discrete case to quadratic powers,
implies that $\phi(x,t)$ obeys the wave equation

$$ \frac{1}{c^2}\frac{\partial^2\phi(x,t)}{\partial t^2} = \frac{\partial^2\phi(x,t)}{\partial x^2} \tag{5.30} $$

where c is the wave propagation velocity. Note that (5.30) is linear, but
only by virtue of having made the small-displacement assumption. Again, we
consider first the classical treatment of this system. Our aim is to find, for
this continuous field problem, the analogue of the normal coordinates (or in
physical terms, the modes of vibration) which were so helpful in the discrete
case. Fortunately, the string’s modes are very familiar. By imposing suitable
boundary conditions at each end of the string, we determine the allowed
wavelengths of waves travelling along the string. Suppose, for simplicity, that
the string is stretched between x = 0 and $x = \ell$. This constrains $\phi(x,t)$ to
vanish at these end points. A suitable form for $\phi(x,t)$ which does this is

$$ \phi_r(x,t) = A_r(t)\sin\left(\frac{r\pi x}{\ell}\right) \tag{5.31} $$

where r = 1, 2, 3, ..., which expresses the fact that an exact number of
half-wavelengths must fit onto the interval $(0, \ell)$. Inserting (5.31) into (5.30), we
find

$$ \ddot{A}_r = -\omega_r^2A_r \tag{5.32} $$
FIGURE 5.3
String motion in two normal modes: (a) r = 1 in equation (5.31); (b) r = 2.


where

$$ \omega_r^2 = r^2\pi^2c^2/\ell^2. \tag{5.33} $$


Thus the amplitude $A_r(t)$ of the particular waveform (5.31) executes simple
harmonic motion with frequency $\omega_r$. Each motion of the string which has a
definite wavelength also has a definite frequency; it is therefore precisely a
mode. Figure 5.3(a) shows two snapshots of the string when it is oscillating
in the mode for which r = 1, and figure 5.3(b) shows the same for the mode
r = 2; these may be compared with figures 5.2(a) and (b). Just as in the
discrete case, the general motion of the string is a superposition of modes

$$ \phi(x,t) = \sum_{r=1}^{\infty}A_r(t)\sin\left(\frac{r\pi x}{\ell}\right); \tag{5.34} $$


in short, a Fourier series! 
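The passage from the discrete chain to the continuous string can be watched numerically. The sketch below compares the string frequencies (5.33) with those of N beads on the same string. It rests on assumptions not made in the text: fixed ends with spacing $a = \ell/(N+1)$ (which differs from $Na = \ell$ only by a term that vanishes as N grows), nearest-neighbour coupling, and the identification $k/m = c^2/a^2$ so that tension and mass density match.

```python
import math

def string_omega(r, c, length):
    """Mode frequency of the continuous string, from (5.33)."""
    return r * math.pi * c / length

def chain_omega(r, N, c, length):
    """Mode frequency of N beads on the same string (assumed model).

    Uses spacing a = length/(N+1) and k/m = c^2/a^2, for which the
    nearest-neighbour chain has omega_r = (2c/a) sin(r*pi*a/(2*length)).
    """
    a = length / (N + 1)
    return (2.0 * c / a) * math.sin(r * math.pi * a / (2.0 * length))

c, length, r = 1.0, 1.0, 3
errors = [abs(chain_omega(r, N, c, length) - string_omega(r, c, length))
          for N in (10, 100, 1000)]
```

The discrepancy shrinks roughly as $a^2$, so for any fixed mode number r the chain frequency tends to $r\pi c/\ell$ as the spacing goes to zero.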

We must now examine the total energy of the vibrating string, which 
we expect to be greatly simplified by the use of the mode concept. The total 
energy is the continuous analogue of the discrete summation in (5.25), namely 


the integral

$$ E = \int_0^{\ell}\left[\tfrac{1}{2}\rho\left(\frac{\partial\phi}{\partial t}\right)^2 + \tfrac{1}{2}\rho c^2\left(\frac{\partial\phi}{\partial x}\right)^2\right]\mathrm{d}x \tag{5.35} $$

where the first term is the kinetic energy and the second is the potential
energy ($\rho$ is the mass per unit length of the string, assumed constant). As
noted earlier, the potential energy term arises from an approximation which
limits it to the quadratic power. To relate this to the earlier discrete case,
note that the derivative may be regarded as $[\phi(x + \delta x) - \phi(x)]/\delta x$ as $\delta x \to 0$,
so that the square of the derivative involves the ‘nearest neighbour coupling’
$\phi(x + \delta x)\phi(x)$, analogous to the $q_1q_2$ term in (5.9).

Inserting (5.34) into (5.35), and using the orthonormality of the sine functions
on the interval $(0, \ell)$, one obtains (problem 5.1) the crucial result

$$ E = (\ell/2)\sum_{r=1}^{\infty}\left[\tfrac{1}{2}\rho\dot{A}_r^2 + \tfrac{1}{2}\rho\omega_r^2A_r^2\right]. \tag{5.36} $$


Indeed, just as in the discrete case, the total energy of the string can be
written as a sum of individual mode energies. We note that the Fourier
amplitude $A_r$ acts as a normal coordinate. Comparing (5.36) with (5.27), we
see that the string behaves exactly like a system of independent uncoupled
oscillators, the only difference being that now there are an infinite number
of them, corresponding to the infinite number of degrees of freedom in the
continuous field $\phi(x,t)$. The normal coordinates $A_r(t)$ are, for many purposes,
a much more relevant set of degrees of freedom than the original displacements
$\phi(x,t)$.
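The mode decomposition of the string energy can be verified by brute force for a field containing only a few modes. The sketch below assumes that the energy integral (5.35) has the standard string form with kinetic density $\tfrac{1}{2}\rho(\partial\phi/\partial t)^2$ and potential density $\tfrac{1}{2}\rho c^2(\partial\phi/\partial x)^2$, and compares a numerical integration with the mode sum (5.36). The amplitudes and parameter values are arbitrary illustrative choices.

```python
import math

def mode_sum_energy(A, Adot, rho, c, length):
    """Right-hand side of (5.36) for finitely many non-zero modes."""
    total = 0.0
    for r, (a, adot) in enumerate(zip(A, Adot), start=1):
        w = r * math.pi * c / length
        total += 0.5 * rho * adot**2 + 0.5 * rho * w**2 * a**2
    return 0.5 * length * total

def integrated_energy(A, Adot, rho, c, length, steps=4000):
    """Midpoint-rule estimate of the assumed energy integral (5.35)."""
    dx = length / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx
        phi_t = sum(adot * math.sin(r * math.pi * x / length)
                    for r, adot in enumerate(Adot, start=1))
        phi_x = sum(a * (r * math.pi / length) * math.cos(r * math.pi * x / length)
                    for r, a in enumerate(A, start=1))
        total += (0.5 * rho * phi_t**2 + 0.5 * rho * c**2 * phi_x**2) * dx
    return total

A = [0.2, -0.1, 0.05]        # mode amplitudes A_1, A_2, A_3 at some instant
Adot = [0.0, 0.3, -0.4]      # their time derivatives at the same instant
e_modes = mode_sum_energy(A, Adot, 1.5, 2.0, 3.0)
e_field = integrated_energy(A, Adot, 1.5, 2.0, 3.0)
```

The cross terms between different modes integrate to zero, so the two estimates agree to rounding accuracy.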

The final step is to apply quantum mechanics to this classical field sys- 
tem. Once again, the total energy is equivalent to that of a sum of (infinitely 
many) mode oscillators, each of which has to be quantized. The total energy 
eigenvalue has the form (5.29), except that now the sum extends to infinity: 


$$ E = \sum_{r=1}^{\infty}(n_r + \tfrac{1}{2})\hbar\omega_r. \tag{5.37} $$


The excited states of the quantized field $\phi(x,t)$ are characterized by saying
how many phonons of each frequency are present; the ground state has no
phonons at all. We remark that as $\ell \to \infty$, the mode sum in (5.36) or (5.37)
will be replaced by an integral over a continuous frequency variable.

We have now completed, in outline, the programme introduced earlier, 
ending up with the quantization of a ‘mechanical’ system. All of the forego- 
ing, it must be clearly emphasized, is absolutely basic to modern solid state 
physics. The essential idea — quantizing independent modes — can be ap- 
plied to an enormous variety of ‘oscillations’. In all cases the crucial concept 
is the elementary excitation — the mode quantum. Thus we have plasmons 
(quanta of plasma oscillations), magnons (magnetic oscillations), ..., as well 
as phonons (vibrational oscillations). All this is securely anchored in the 
physics of many-body systems. 

Now we come to the use of these ideas as an analogy, to help us understand 
the (presumably non-mechanical) quantum fields with which we shall actually 
be concerned in this book — for example the electromagnetic field. Consider a 
region of space containing electromagnetic fields. These fields obey (a three- 
dimensional version of) the wave equation (5.30), with c now standing for 
the speed of light. By imposing suitable boundary conditions, the total elec- 
tromagnetic energy in any region of space can be written as a sum of mode 
energies. Each mode has the form of an oscillator, whose amplitude is (see 
(5.31)) the Fourier component of the wave, for a given wavelength. These 
oscillators are all quantized. Their quanta are called photons. Thus, a photon 
is an elementary quantum of excitation of the electromagnetic field. 

So far the only kind of ‘particle’ we have in our relativistic quantum field 
theoretic world is the photon. What about the electron, say? Well, recalling 
Feynman again, ‘There is one lucky break, however — electrons behave just 
like light’. In other words, we shall also regard an electron as an elementary 
quantum of excitation of an ‘electron field’. What is ‘waving’ to supply the
vibrations for this electron field? We do not answer this question just as we did
not for the photon. We postulate a relativistic quantum field for the electron 
which obeys some suitable wave equation — in this case, for non-interacting 
electrons, the Dirac equation. The field is expanded as a sum of Fourier 
components, as with the electromagnetic field. Each component behaves as 
an independent oscillator degree of freedom (and there are, of course, an 
infinite number of them); the quanta of these oscillators are electrons. 

Actually this, though correctly expressing the basic idea, omits one crucial 
factor, which makes it almost fraudulently oversimplified. There is of course 
one very big difference between photons and electrons. The former are bosons 
and the latter are fermions; photons have spin angular momentum of one
(in units of $\hbar$), electrons of one-half. It is very difficult, if not downright
impossible, to construct any mechanical model at all which has fermionic
excitations. Phonons have spin-1, in fact, corresponding to the three states of
polarization of the corresponding vibrational waves. But ‘phonons’ carrying
spin-$\tfrac{1}{2}$ are hard to come by. No matter, you may say, Maxwell has weaned
us away from jelly, so we shall be grown up and boldly postulate the electron 
field as a basic thing. 

Certainly this is what we do. But we also know that fermionic particles, 
like electrons, have to obey an exclusion principle: no two identical fermions 
can have the same quantum numbers. In chapter 7, we shall learn how the 
idea sketched here must be modified for fields whose quanta are fermions. 




5.2 The quantum field: (ii) Lagrange-Hamilton 
formulation 


5.2.1 The action principle: Lagrangian particle mechanics 


We must now make the foregoing qualitative picture more mathematically 
precise. It is clear that we would like a formalism capable of treating, within 
a single overall framework, the mechanics of both fields and particles, in both 
classical and quantum aspects. Remarkably enough, such a framework does 
exist (and was developed long before quantum field theory): Hamilton's prin- 
ciple of least action, with the action defined in terms of a Lagrangian. We 
strongly recommend the reader with no prior acquaintance with this pro- 
found approach to physical laws read chapter 19 of volume 2 of Feynman's 
Lectures on Physics (Feynman 1964). 

FIGURE 5.4
Possible space-time trajectories from ‘Here’ ($q(t_1)$) to ‘There’ ($q(t_2)$).


The least action approach differs radically from the more familiar one
which can conveniently be called ‘Newtonian’. Consider the simplest case,
that of classical particle mechanics. In the Newtonian approach, equations
of motion are postulated which involve forces as the essential physical input;
from these, the trajectories of the particle can be calculated. In the least
action approach, equations of motion are not postulated as basic, and the
primacy of forces yields to that of potentials. The path by which a particle 
actually travels is determined by the postulate (or principle) that it has to 
follow that particular path, out of infinitely many possible ones, for which a 
certain quantity — the action — is minimized. The action S is defined by 


$$ S = \int_{t_1}^{t_2}L(q(t), \dot{q}(t))\,\mathrm{d}t \tag{5.38} $$


where q(t) is the position of the particle as a function of time, $\dot{q}(t)$ is its
velocity and the all-important function L is the Lagrangian. Given L as an
explicit function of the variables q(t) and $\dot{q}(t)$, we can imagine evaluating S
for all sorts of possible q(t)’s starting at time $t_1$ and ending at time $t_2$. We
can draw these different possible trajectories on a q versus t diagram as in
figure 5.4. For each path we evaluate S: the actual path is the one for which
S is smallest, by hypothesis.

But what is L? In simple cases (as we shall verify later) L is just T — V, 
the difference of kinetic and potential energies. Thus for a single particle in a 
potential V 

$$ L = \tfrac{1}{2}m\dot{x}^2 - V(x). \tag{5.39} $$


Knowing V(x), we can try and put the ‘action principle’ into action. However,
how can we set about finding which trajectory minimizes S? It is quite
interesting to play with some simple specific examples and actually calculate 
S for several ‘fictitious’ trajectories — i.e. ones that we know from the Newto- 
nian approach are not followed by the particle — and try and get a feeling for 
what the actual trajectory that minimizes S might be like (of course it is the 
Newtonian one — see problem 5.2). But clearly this is not a practical answer 
to the general problem of finding the q(t) that minimizes S. Actually, we can 
solve this problem by calculus. 
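Playing with ‘fictitious’ trajectories is easily done on a computer. The following minimal sketch is one possible such experiment and is not taken from problem 5.2: it assumes a particle in a uniform field, $L = \tfrac{1}{2}m\dot{x}^2 - mgx$, with $x(0) = x(T) = 0$, and deforms the Newtonian path by $\epsilon\sin(\pi t/T)$, which leaves the end points fixed. The parameter values are illustrative.

```python
import math

def action(eps, m=1.0, g=9.8, T=1.0, steps=2000):
    """Midpoint-rule action for x(t) = x_c(t) + eps*sin(pi*t/T).

    x_c(t) = g*t*(T - t)/2 is the Newtonian path with x(0) = x(T) = 0 for
    L = m*v^2/2 - m*g*x; the sine term vanishes at both end points.
    """
    dt = T / steps
    S = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt
        x = 0.5 * g * t * (T - t) + eps * math.sin(math.pi * t / T)
        v = 0.5 * g * (T - 2.0 * t) + eps * (math.pi / T) * math.cos(math.pi * t / T)
        S += (0.5 * m * v * v - m * g * x) * dt
    return S

# Action for the Newtonian path (eps = 0) and four fictitious neighbours.
actions = {eps: action(eps) for eps in (-0.2, -0.1, 0.0, 0.1, 0.2)}
```

For this Lagrangian the excess action is exactly $m\epsilon^2\pi^2/(4T)$: there is no term linear in $\epsilon$, so the Newtonian path is the one for which S is smallest within this family.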
Our problem is something like the familiar one of finding the point $t_0$ at
which a certain function f(t) has a stationary value. In the present case,
however, the function S is not a simple function of t; rather it is a function
of the entire set of points q(t). It is a function of the function q(t), or a
‘functional’ of q(t). We want to know what particular ‘$q_c(t)$’ minimizes S.

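By analogy with the single-variable case, we consider a small variation $\delta q(t)$
in the path from $q(t_1)$ to $q(t_2)$. At the minimum, the change $\delta S$ corresponding
to the change $\delta q$ must vanish. This change in the action is given by

$$ \delta S = \int_{t_1}^{t_2}\left(\frac{\partial L}{\partial q(t)}\delta q(t) + \frac{\partial L}{\partial\dot{q}(t)}\delta\dot{q}(t)\right)\mathrm{d}t. \tag{5.40} $$

Using $\delta\dot{q}(t) = \mathrm{d}(\delta q(t))/\mathrm{d}t$ and integrating the second term by parts yields

$$ \delta S = \int_{t_1}^{t_2}\delta q(t)\left[\frac{\partial L}{\partial q(t)} - \frac{\mathrm{d}}{\mathrm{d}t}\frac{\partial L}{\partial\dot{q}(t)}\right]\mathrm{d}t + \left[\frac{\partial L}{\partial\dot{q}(t)}\delta q(t)\right]_{t_1}^{t_2}. \tag{5.41} $$

Since we are considering variations of path in which all trajectories start at $t_1$
and end at $t_2$, $\delta q(t_1) = \delta q(t_2) = 0$. So the condition that S be stationary is

$$ \delta S = \int_{t_1}^{t_2}\delta q(t)\left[\frac{\partial L}{\partial q(t)} - \frac{\mathrm{d}}{\mathrm{d}t}\frac{\partial L}{\partial\dot{q}(t)}\right]\mathrm{d}t = 0. \tag{5.42} $$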


Since this must be true for arbitrary $\delta q(t)$, we must have

$$ \frac{\partial L}{\partial q(t)} - \frac{\mathrm{d}}{\mathrm{d}t}\frac{\partial L}{\partial\dot{q}(t)} = 0. \tag{5.43} $$

This is the celebrated Euler-Lagrange equation of motion. Its solution gives
the ‘$q_c(t)$’ which the particle actually follows.

We can see how this works for the simple case (5.39) where q is the coordinate
x. We have immediately

$$ \partial L/\partial\dot{x} = m\dot{x} = p \tag{5.44} $$

and

$$ \partial L/\partial x = -\partial V/\partial x = F \tag{5.45} $$


where p and F are, respectively, the momentum and the force of the Newtonian 
approach. The Euler-Lagrange equation then reads 


F = dp/dt (5.46) 


precisely the Newtonian equation of motion. For the special case of a harmonic 
oscillator (obviously fundamental for the quantum field idea, as section 5.1 
should have made clear), we have 


$$ L = \tfrac{1}{2}m\dot{q}^2 - \tfrac{1}{2}m\omega^2q^2 \tag{5.47} $$
which can be immediately generalized to N independent oscillators (see
section 5.1) via

$$ L = \sum_{r=1}^{N}(\tfrac{1}{2}m\dot{Q}_r^2 - \tfrac{1}{2}m\omega_r^2Q_r^2). \tag{5.48} $$


For many dynamical systems, the Lagrangian has the form ‘T — V’ indi- 
cated in (5.47) and (5.48). 
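As a short worked check, applying the Euler-Lagrange equation (5.43) to each coordinate $Q_r$ of the Lagrangian (5.48) reproduces the uncoupled mode equations (5.24). The only step assumed beyond the text is that (5.43) holds separately for each independent coordinate, which is the standard many-coordinate form of the result.

```latex
\begin{align*}
\frac{\partial L}{\partial \dot{Q}_r} &= m\dot{Q}_r, &
\frac{\partial L}{\partial Q_r} &= -m\omega_r^2 Q_r, \\
\frac{\partial L}{\partial Q_r}
  - \frac{\mathrm{d}}{\mathrm{d}t}\frac{\partial L}{\partial \dot{Q}_r} &= 0
&\Longrightarrow\quad \ddot{Q}_r + \omega_r^2 Q_r &= 0 .
\end{align*}
```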

Our next step will be to replace classical particle mechanics by quantum 
particle mechanics. The standard way to do this is via the Hamiltonian formu- 
lation of classical mechanics, which we will now briefly review for the simple 
system with Lagrangian (5.39). In Hamiltonian dynamics, the variables used 
are not the Lagrangian ones of position x and velocity $\dot{x}$, but rather the
position x and the canonical momentum p, where p is defined by

$$ p = \frac{\partial L}{\partial\dot{x}}. \tag{5.49} $$

The place of the Lagrangian is taken by the Hamiltonian H(x, p) which is
defined by

$$ H(x, p) = p\dot{x} - L. \tag{5.50} $$

Using (5.39) for L we find $p = m\dot{x}$, and placing this result in (5.50) we obtain

$$ H(x, p) = \frac{p^2}{2m} + V(x) \tag{5.51} $$

which in this case is just the total energy, expressed in terms of x and p.
Instead of the Euler-Lagrange equation we have the Hamiltonian equations of
motion, which are

$$ \frac{\partial H}{\partial p} = \dot{x} \tag{5.52} $$

and

$$ \frac{\partial H}{\partial x} = -\dot{p}. \tag{5.53} $$

For the case (5.51) these equations yield

$$ p/m = \dot{x} \tag{5.54} $$

and

$$ \dot{p} = -\partial V/\partial x. \tag{5.55} $$


Equation (5.54) is just the familiar relation of p to $\dot{x}$, and (5.55) is the
Newtonian equation of motion. In the same way, the reader may check that the
Hamiltonian for the assembly of oscillators described by the Lagrangian (5.48)
is

$$ H = \sum_{r=1}^{N}[(1/2m)P_r^2 + \tfrac{1}{2}m\omega_r^2Q_r^2] \tag{5.56} $$

where $P_r = m\dot{Q}_r$.
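The check of (5.56) takes only a few lines. The sketch below assumes the natural many-coordinate extension of (5.49) and (5.50), namely $P_r = \partial L/\partial\dot{Q}_r$ and $H = \sum_r P_r\dot{Q}_r - L$.

```latex
\begin{align*}
P_r &= \frac{\partial L}{\partial \dot{Q}_r} = m\dot{Q}_r
  \quad\Longrightarrow\quad \dot{Q}_r = \frac{P_r}{m}, \\
H &= \sum_{r=1}^{N} P_r\dot{Q}_r - L
   = \sum_{r=1}^{N}\left[\frac{P_r^2}{m} - \frac{P_r^2}{2m}
     + \tfrac{1}{2}m\omega_r^2 Q_r^2\right]
   = \sum_{r=1}^{N}\left[\frac{1}{2m}P_r^2 + \tfrac{1}{2}m\omega_r^2 Q_r^2\right].
\end{align*}
```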
With this in hand, we turn to quantum particle mechanics. 
5.2.2 Quantum particle mechanics à la Heisenberg-Lagrange-Hamilton


It seems likely that a particularly direct correspondence between the quantum 
and the classical cases will be obtained if we use the Heisenberg formulation 
(or ‘picture’) of quantum mechanics (see appendix I). In the Schrödinger picture,
the dynamical variables such as position x are independent of time,
and the time dependence is carried by the wavefunction. Thus we seem to
have nothing like the q(t)’s. However, one can always do a unitary transformation
to the Heisenberg picture, in which the wavefunction is fixed and
the dynamical variables change with time. This is what we want in order to
parallel the classical quantities q(t). But of course there is one fundamental
difference between quantum mechanics and classical mechanics: in the former,
the dynamical variables are operators which in general do not commute. In
particular, the fundamental commutator states that ($\hbar = 1$)


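$$ [\hat{q}(t), \hat{p}(t)] = i \tag{5.57} $$


where ˆ indicates the operator character of the quantity. Here $\hat{p}$ is defined by
the generalization of (5.44):

$$ \hat{p} = \partial\hat{L}/\partial\dot{\hat{q}}. \tag{5.58} $$


In this formulation of quantum mechanics we do not have the Schrödinger-type
equation of motion. Instead we have the Heisenberg equation of motion

$$ \dot{\hat{A}} = -i[\hat{A}, \hat{H}] \tag{5.59} $$


where the Hamiltonian operator $\hat{H}$ is defined in terms of the Lagrangian
operator $\hat{L}$ by (cf (5.50))

$$ \hat{H} = \hat{p}\dot{\hat{q}} - \hat{L} \tag{5.60} $$

and $\hat{A}$ is any dynamical observable. For example, in the oscillator case

$$ \hat{L} = \tfrac{1}{2}m\dot{\hat{q}}^2 - \tfrac{1}{2}m\omega^2\hat{q}^2 \tag{5.61} $$
$$ \hat{p} = m\dot{\hat{q}} \tag{5.62} $$

and

$$ \hat{H} = \frac{1}{2m}\hat{p}^2 + \tfrac{1}{2}m\omega^2\hat{q}^2 \tag{5.63} $$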


which is the total energy operator. Note that $\hat{p}$, obtained from the Lagrangian
using (5.58), had better be consistent with the Heisenberg equation of motion
for the operator $\hat{A} = \hat{q}$. The Heisenberg equation of motion for $\hat{A} = \hat{p}$ leads
to

$$ \dot{\hat{p}} = -m\omega^2\hat{q} \tag{5.64} $$


which is an operator form of Newton’s law for the harmonic oscillator. Using
the expression for $\hat{p}$ (5.62), we find

$$ \ddot{\hat{q}} = -\omega^2\hat{q}. \tag{5.65} $$
Now, although this looks like the familiar classical equation of motion
for the position of the oscillator — and recovering it from the Lagrangian 
formalism is encouraging — we must be very careful to appreciate that this is 
an equation stating how an operator evolves with time. Where the quantum 
particle will actually be found is an entirely different matter. By sandwiching 
(5.65) between wavefunctions, we can at once see that the average position of 
the particle will follow the classical trajectory (remember that wavefunctions 
are independent of time in the Heisenberg formulation). But fluctuations 
about this trajectory will certainly occur: a quantum particle does not follow 
a ray-like classical trajectory. Come to think of it, neither does a photon! 
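For completeness, the commutator algebra behind the operator equations (5.64) and (5.65) can be written out. The sketch uses only the fundamental commutator (5.57), the Heisenberg equation (5.59) and the oscillator Hamiltonian (5.63), together with the identity $[A, BC] = [A, B]C + B[A, C]$.

```latex
\begin{align*}
[\hat{q}, \hat{p}^2] &= [\hat{q},\hat{p}]\hat{p} + \hat{p}[\hat{q},\hat{p}] = 2i\hat{p}, &
[\hat{p}, \hat{q}^2] &= [\hat{p},\hat{q}]\hat{q} + \hat{q}[\hat{p},\hat{q}] = -2i\hat{q}, \\
\dot{\hat{q}} &= -i[\hat{q}, \hat{H}] = -\frac{i}{2m}[\hat{q},\hat{p}^2] = \frac{\hat{p}}{m}, &
\dot{\hat{p}} &= -i[\hat{p}, \hat{H}] = -\frac{i m\omega^2}{2}[\hat{p},\hat{q}^2] = -m\omega^2\hat{q}.
\end{align*}
```

The first result confirms that (5.62) is consistent with the Heisenberg equation for $\hat{q}$; combining the two gives $\ddot{\hat{q}} = -\omega^2\hat{q}$.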

In the original formulations of quantum theory, such fluctuations were gen- 
erally taken to imply that the very notion of a ‘path’ was no longer a useful 
one. However, just as the differential equations satisfied by operators in the 
Heisenberg picture are quantum generalizations of Newtonian mechanics, so 
there is an analogous quantum generalization of the ‘path-contribution to the 
action’ approach to classical mechanics. The idea was first hinted at by Dirac 
(1933, 1981, section 32), but it was Feynman who worked it out completely. 
The book by Feynman and Hibbs (1965) presents a characteristically fascinating
discussion; here we only wish to indicate the central idea. We ask:
how does a particle get from the point $q(t_1)$ at time $t_1$ to the point $q(t_2)$ at
$t_2$? Referring back to figure 5.4, in the classical case we imagined (infinitely)
many possible paths $q_i(t)$, of which, however, only one was the actual path
followed, namely the one we called $q_c(t)$ which minimized the action integral
(5.38) as a functional of q(t). In the quantum case, however, we previously
noted that a particle will no longer follow any definite path, because of quan- 
tum fluctuations. But rather than, as a consequence, throwing away the whole 
idea of a path, Feynman’s insight was to appreciate that the ‘opposite’ view- 
point is also possible: since unique paths are forbidden in quantum theory, we 
should in principle include all possible paths! In other words, we take all the 
trajectories on figure 5.4 as physically possible (together with all the other 
infinitely many ways of accomplishing the trip). 

However, surely not all paths are equally likely: after all, we must presumably
recover the classical trajectory as $\hbar \to 0$, in some sense. Thus we must
find an appropriate weighting for the paths. Feynman’s recipe is beautifully
simple: weight each path by the factor

$$ e^{iS/\hbar} \tag{5.66} $$


where S is the action for that particular path. At first sight this is a rather 
strange proposal, since all paths — even the classical one — are weighted by a 
quantity which is of unit modulus. But of course contributions of the form 
(5.66) from all the paths have to be added coherently — just as we superposed 
the amplitudes in the ‘two-slit’ discussion in section 2.5. What distinguishes 
the classical path $q_c(t)$ is that it makes S stationary under small changes of
path: thus in its vicinity paths have a strong tendency to add up constructively,
while far from it the phase factors will tend to produce cancellations.
The amount a quantum particle can ‘stray’ from the classical path depends
on the magnitude of the corresponding action relative to $\hbar$, the quantum of
action: the scale of coherence is set by $\hbar$.
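The constructive and destructive addition of phases can be illustrated with a one-parameter family of paths. In the sketch below the excess action is taken to be quadratic in a path parameter $\epsilon$, so that each path contributes a phase $e^{ia\epsilon^2}$ with $a\epsilon^2 = (S - S_{\rm classical})/\hbar$. The quadratic form, the value of a and the window widths are all illustrative assumptions, not a calculation of the full sum over paths.

```python
import cmath

def coherence(eps_center, half_width, a, samples=2001):
    """|average of exp(i*a*eps^2)| over a window of path parameters eps.

    A value near 1 means the paths in the window add constructively;
    a value near 0 means their phases largely cancel.
    """
    total = 0j
    for j in range(samples):
        eps = eps_center - half_width + 2.0 * half_width * j / (samples - 1)
        total += cmath.exp(1j * a * eps * eps)
    return abs(total) / samples

a = 100.0                        # illustrative size of the action scale / hbar
near = coherence(0.0, 0.05, a)   # window around the classical path
far = coherence(2.0, 0.05, a)    # equally wide window far from it
```

Near $\epsilon = 0$ the phase is stationary and the average stays close to 1, while in the distant window the phase winds through several cycles and the contributions nearly cancel. Increasing a (that is, decreasing $\hbar$) narrows the region of coherence around the classical path.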

In summary, then, the quantum mechanical amplitude to go from $q(t_1)$ to
$q(t_2)$ is proportional to

$$ \sum_{\text{all paths } q(t)}\exp\left(\frac{i}{\hbar}\int_{t_1}^{t_2}L(q(t), \dot{q}(t))\,\mathrm{d}t\right). \tag{5.67} $$


There is an evident generalization to quantum field theory. We shall not, 
however, make use of the ‘path integral’ approach to quantum field theory in 
this volume. Its use was, in fact, decisive in obtaining the Feynman rules for 
non-Abelian gauge theories; and it is the only approach suitable for numerical 
studies of quantum field theories (how can operators be simulated numeri- 
cally?). Nevertheless, for a first introduction to quantum field theory, there 
is still much to be said for the traditional approach based on ‘quantizing the 
modes’, and this is the path we shall follow in the rest of this volume. Not the 
least of its advantages is that it contains the intuitively powerful ‘calculus’ of 
creation and annihilation operators, as we now describe. We shall return to 
the path integral formalism in chapter 16 of volume 2. 


5.2.3 Interlude: the quantum oscillator 


As we saw in section 5.1, we need to know the energy spectrum and associated 
states of a quantum harmonic oscillator. This is a standard problem, but there 
is one particular way of solving it — the ‘operator’ approach due to Dirac (1981, 
chapter 6) — that is so crucial to all subsequent development that we include 
a discussion here in the body of the text. 
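For the oscillator Hamiltonian

$$ \hat{H} = \frac{1}{2m}\hat{p}^2 + \tfrac{1}{2}m\omega^2\hat{q}^2 \tag{5.68} $$

if $\hat{p}$ and $\hat{q}$ were not operators, we could attempt to factorize the Hamiltonian
in the form ‘$(q + ip)(q - ip)$’ (apart from the factors of 2m and $\omega$). In the
quantum case, in which $\hat{p}$ and $\hat{q}$ do not commute, it still turns out to be very
helpful to introduce such combinations. If we define the operator

$$ \hat{a} = \frac{1}{\sqrt{2}}\left(\sqrt{m\omega}\,\hat{q} + \frac{i}{\sqrt{m\omega}}\hat{p}\right) \tag{5.69} $$

and its Hermitian conjugate

$$ \hat{a}^\dagger = \frac{1}{\sqrt{2}}\left(\sqrt{m\omega}\,\hat{q} - \frac{i}{\sqrt{m\omega}}\hat{p}\right) \tag{5.70} $$

the Hamiltonian may be written as (see problem 5.4)

$$ \hat{H} = \tfrac{1}{2}(\hat{a}^\dagger\hat{a} + \hat{a}\hat{a}^\dagger)\omega = (\hat{a}^\dagger\hat{a} + \tfrac{1}{2})\omega. \tag{5.71} $$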
The second form for $\hat{H}$ may be obtained from the first using the commutation
relation between $\hat{a}$ and $\hat{a}^\dagger$

$$ [\hat{a}, \hat{a}^\dagger] = 1 \tag{5.72} $$

derived using the fundamental commutator between $\hat{p}$ and $\hat{q}$. Using this basic
commutator (5.72) and our expression for $\hat{H}$, (5.71), one can prove the
relations (see problem 5.4)

$$ [\hat{H}, \hat{a}] = -\omega\hat{a} \qquad [\hat{H}, \hat{a}^\dagger] = \omega\hat{a}^\dagger. \tag{5.73} $$
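The steps behind (5.71) to (5.73) are short enough to display. The following sketch uses only the definitions (5.69) and (5.70) and the fundamental commutator $[\hat{q}, \hat{p}] = i$; it is one way of organizing the algebra and is not claimed to be the route intended in problem 5.4.

```latex
\begin{align*}
[\hat{a}, \hat{a}^\dagger]
  &= \tfrac{1}{2}\bigl(-i[\hat{q},\hat{p}] + i[\hat{p},\hat{q}]\bigr)
   = \tfrac{1}{2}(1 + 1) = 1, \\
\hat{a}^\dagger\hat{a}
  &= \tfrac{1}{2}\Bigl(m\omega\,\hat{q}^2 + \frac{\hat{p}^2}{m\omega}
     + i[\hat{q},\hat{p}]\Bigr)
   = \frac{1}{\omega}\Bigl(\frac{\hat{p}^2}{2m}
     + \tfrac{1}{2}m\omega^2\hat{q}^2\Bigr) - \tfrac{1}{2}
  \quad\Longrightarrow\quad
  \hat{H} = (\hat{a}^\dagger\hat{a} + \tfrac{1}{2})\omega, \\
[\hat{H}, \hat{a}]
  &= \omega[\hat{a}^\dagger\hat{a}, \hat{a}]
   = \omega[\hat{a}^\dagger, \hat{a}]\hat{a} = -\omega\hat{a}, \qquad
[\hat{H}, \hat{a}^\dagger]
   = \omega\hat{a}^\dagger[\hat{a}, \hat{a}^\dagger] = \omega\hat{a}^\dagger .
\end{align*}
```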
Consider now a state |n) which is an eigenstate of H with energy Ey: 


H|n) = E,|n). (5.74) 


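Using this definition and the commutators (5.73), we can calculate the energy
of the states $(\hat{a}^\dagger|n\rangle)$ and $(\hat{a}|n\rangle)$. We find

$$\hat{H}(\hat{a}^\dagger|n\rangle) = (E_n + \omega)(\hat{a}^\dagger|n\rangle) \qquad (5.75)$$
$$\hat{H}(\hat{a}|n\rangle) = (E_n - \omega)(\hat{a}|n\rangle). \qquad (5.76)$$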
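The step from the commutators (5.73) to the raised and lowered energies can be written out explicitly. The lines below are a short worked version of that step, using only (5.73) and (5.74); nothing beyond those two relations is assumed.

```latex
\begin{align*}
\hat{H}\,(\hat{a}^\dagger|n\rangle)
  &= \bigl([\hat{H},\hat{a}^\dagger] + \hat{a}^\dagger\hat{H}\bigr)|n\rangle
   = \bigl(\omega\hat{a}^\dagger + \hat{a}^\dagger E_n\bigr)|n\rangle
   = (E_n + \omega)\,(\hat{a}^\dagger|n\rangle), \\
\hat{H}\,(\hat{a}|n\rangle)
  &= \bigl([\hat{H},\hat{a}] + \hat{a}\hat{H}\bigr)|n\rangle
   = \bigl(-\omega\hat{a} + \hat{a} E_n\bigr)|n\rangle
   = (E_n - \omega)\,(\hat{a}|n\rangle).
\end{align*}
```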


Thus the operators $\hat{a}^\dagger$ and $\hat{a}$ respectively raise and lower the energy of $|n\rangle$ by
one unit of $\omega$ ($\hbar = 1$). Now since $\hat{H} \sim \hat{p}^2 + \hat{q}^2$ with $\hat{p}$ and $\hat{q}$ Hermitian, we can
prove that $\langle\psi|\hat{H}|\psi\rangle$ is positive-definite for any state $|\psi\rangle$. Thus the operator $\hat{a}$
cannot lower the energy indefinitely: there must exist a lowest state $|0\rangle$ such
that

$$\hat{a}|0\rangle = 0. \qquad (5.77)$$

This defines the lowest-energy state of the system; its energy is
$$\hat{H}|0\rangle = \tfrac{1}{2}\omega|0\rangle \qquad (5.78)$$
the ‘zero-point energy’ of the quantum oscillator. The first excited state is
$$|1\rangle = \hat{a}^\dagger|0\rangle \qquad (5.79)$$

with energy $(1 + \tfrac{1}{2})\omega$. The $n$th state has energy $(n + \tfrac{1}{2})\omega$ and is proportional
to $(\hat{a}^\dagger)^n|0\rangle$. To obtain a normalization

$$\langle n|n\rangle = 1 \qquad (5.80)$$

the correct normalization factor can be shown to be (problem 5.4)
$$|n\rangle = \frac{1}{\sqrt{n!}}(\hat{a}^\dagger)^n|0\rangle. \qquad (5.81)$$

Returning to the eigenvalue equation for $\hat{H}$, we have arrived at the result

$$\hat{H}|n\rangle = (\hat{a}^\dagger\hat{a} + \tfrac{1}{2})\omega|n\rangle = (n + \tfrac{1}{2})\omega|n\rangle \qquad (5.82)$$
so that the state $|n\rangle$ defined by (5.81) is an eigenstate of the number operator
$\hat{n} = \hat{a}^\dagger\hat{a}$, with integer eigenvalue $n$:

$$\hat{n}|n\rangle = n|n\rangle. \qquad (5.83)$$
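The operator results above can be checked numerically in a truncated number basis. The following is a minimal sketch, not part of the original text: the function names are illustrative, the basis is cut off at a hypothetical dimension `dim`, and the matrix elements $\langle n-1|\hat{a}|n\rangle = \sqrt{n}$ are taken from the normalization (5.81). Because of the cut-off, the commutator (5.72) holds on every diagonal entry except the last.

```python
import math


def matmul(x, y):
    """Multiply two square matrices stored as nested lists."""
    size = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def ladder_matrices(dim):
    """Return (a, a_dagger) restricted to the states |0>, ..., |dim-1>."""
    a = [[0.0] * dim for _ in range(dim)]
    for n in range(1, dim):
        a[n - 1][n] = math.sqrt(n)  # <n-1| a |n> = sqrt(n)
    # The matrix is real, so the Hermitian conjugate is just the transpose.
    a_dagger = [[a[j][i] for j in range(dim)] for i in range(dim)]
    return a, a_dagger


def commutator(x, y):
    """Return the matrix commutator [x, y] = xy - yx."""
    xy, yx = matmul(x, y), matmul(y, x)
    return [[xy[i][j] - yx[i][j] for j in range(len(x))] for i in range(len(x))]


def oscillator_hamiltonian(dim, omega=1.0):
    """H = (a_dagger a + 1/2) * omega, the second form in (5.71), with hbar = 1."""
    a, a_dagger = ladder_matrices(dim)
    number = matmul(a_dagger, a)
    return [[omega * (number[i][j] + (0.5 if i == j else 0.0)) for j in range(dim)]
            for i in range(dim)]


a, a_dagger = ladder_matrices(6)
# Diagonal of [a, a_dagger]: close to 1 everywhere except the last (truncated) entry.
print([row[i] for i, row in enumerate(commutator(a, a_dagger))])
# Diagonal of H for omega = 2: close to (n + 1/2) * 2 for n = 0, ..., 5.
print([row[i] for i, row in enumerate(oscillator_hamiltonian(6, omega=2.0))])
```

The Hamiltonian matrix comes out diagonal in this basis, which is the matrix statement of (5.82) and (5.83); the anomalous last entry of the commutator is an artefact of the truncation and moves out of sight as `dim` grows.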


It is straightforward to generalize all the foregoing to a system whose 
Lagrangian is a sum of N independent oscillators, as in (5.48): 


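$$\hat{L} = \sum_{r=1}^{N}\left(\tfrac{1}{2}m\dot{\hat{q}}_r^2 - \tfrac{1}{2}m\omega_r^2\hat{q}_r^2\right). \qquad (5.84)$$
The required generalization of the basic commutation relations (5.57) is
$$[\hat{q}_r, \hat{p}_s] = i\delta_{rs}, \qquad [\hat{q}_r, \hat{q}_s] = [\hat{p}_r, \hat{p}_s] = 0 \qquad (5.85)$$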


since the different oscillators labelled by the index r or s are all independent. 
The Hamiltonian is (cf (5.56)) 


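$$\hat{H} = \sum_{r=1}^{N}\left[(1/2m)\hat{p}_r^2 + \tfrac{1}{2}m\omega_r^2\hat{q}_r^2\right] \qquad (5.86)$$
$$\phantom{\hat{H}} = \sum_{r=1}^{N}(\hat{a}_r^\dagger\hat{a}_r + \tfrac{1}{2})\omega_r \qquad (5.87)$$

with $\hat{a}_r$ and $\hat{a}_r^\dagger$ defined via the analogues of (5.69) and (5.70). Since the
eigenvalues of each number operator $\hat{n}_r = \hat{a}_r^\dagger\hat{a}_r$ are $n_r$, by the previous results,
the eigenvalues of $\hat{H}$ indeed have the form (5.29),

$$E = \sum_{r=1}^{N}(n_r + \tfrac{1}{2})\omega_r. \qquad (5.88)$$

The corresponding eigenstates are products $|n_1\rangle|n_2\rangle\cdots|n_N\rangle$ of $N$ individual
oscillator eigenstates, where $|n_r\rangle$ contains $n_r$ quanta of excitation, of frequency
$\omega_r$; the product state is usually abbreviated to $|n_1, n_2, \ldots, n_N\rangle$. In the
ground state of the system, each individual oscillator is unexcited: this state
is $|0, 0, \ldots, 0\rangle$, which is abbreviated to $|0\rangle$, where it is understood that

$$\hat{a}_r|0\rangle = 0 \quad \text{for all } r. \qquad (5.89)$$

The operators $\hat{a}_r^\dagger$ create oscillator quanta; the operators $\hat{a}_r$ destroy oscillator
quanta.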
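The bookkeeping for a set of independent oscillators is simple enough to sketch in a few lines. The example below is illustrative only: the function names and the sample frequencies are assumptions, states are represented just by their lists of occupation numbers, and normalization factors of the states are ignored.

```python
def total_energy(occupations, frequencies):
    """Return E = sum_r (n_r + 1/2) * omega_r, as in (5.88), with hbar = 1."""
    if len(occupations) != len(frequencies):
        raise ValueError("need one occupation number per oscillator")
    if any(n < 0 for n in occupations):
        raise ValueError("occupation numbers must be non-negative")
    return sum((n + 0.5) * w for n, w in zip(occupations, frequencies))


def create_quantum(occupations, r):
    """Occupation numbers after a_r^dagger acts (normalization factor dropped)."""
    raised = list(occupations)
    raised[r] += 1
    return raised


frequencies = [1.0, 2.0, 3.0]         # illustrative values of omega_r
ground = [0, 0, 0]                    # the state |0, 0, 0>
excited = create_quantum(ground, 1)   # one quantum in the second oscillator

print(total_energy(ground, frequencies))                                       # 3.0
print(total_energy(excited, frequencies) - total_energy(ground, frequencies))  # 2.0
```

The first number is the summed zero-point energy; the second shows that creating one quantum in oscillator $r$ raises the energy by exactly $\omega_r$, independently of the other oscillators.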


5.2.4 Lagrange—Hamilton classical field mechanics 


We now consider how to use the Lagrange-Hamilton approach for a field, 
starting again with the classical case and limiting ourselves to one dimension 
to start with.

FIGURE 5.5
The passage from a large number of discrete degrees of freedom (mass points)
to a continuous degree of freedom (field).


As explained in the previous section, we shall have in mind the $N \to \infty$
limit of the $N$ degrees of freedom case

$$\{q_r(t);\ r = 1, 2, \ldots, N\} \xrightarrow{N \to \infty} \phi(x, t) \qquad (5.90)$$

where $x$ is now a continuous variable labelling the displacement of the ‘string’
(to picture a concrete system, see figure 5.5). At each point $x$ we have an
independent degree of freedom $\phi(x, t)$; thus the field system has a ‘continuous
infinity’ of degrees of freedom. We now formulate everything in terms of a
Lagrangian density $\mathcal{L}$:

$$S = \int \mathrm{d}t\, L \qquad (5.91)$$
where (in one dimension)

$$L = \int \mathrm{d}x\, \mathcal{L}. \qquad (5.92)$$

Equation (5.90) suggests that $\phi$ has dimension of [length], and since in the
discrete case $L = T - V$, $\mathcal{L}$ has dimension [energy/length]. (In general $\mathcal{L}$ has
dimension [energy/volume].)
A new feature arises because $\phi$ is now a continuous function of $x$, so that
$\mathcal{L}$ can depend on $\partial\phi/\partial x$ as well as on $\phi$ and $\dot{\phi} = \partial\phi/\partial t$: $\mathcal{L} = \mathcal{L}(\phi, \partial\phi/\partial x, \dot{\phi})$.
As before, we postulate the same fundamental principle 


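$$\delta S = 0 \qquad (5.93)$$

meaning that the dynamics of the field $\phi$ is governed by requiring $S$ to be
stationary (in practice, usually a minimum). This time the total variation is given by

$$\delta S = \int \mathrm{d}t \int \mathrm{d}x \left[\frac{\partial\mathcal{L}}{\partial\phi}\delta\phi + \frac{\partial\mathcal{L}}{\partial(\partial\phi/\partial x)}\delta\!\left(\frac{\partial\phi}{\partial x}\right) + \frac{\partial\mathcal{L}}{\partial\dot{\phi}}\delta\dot{\phi}\right]. \qquad (5.94)$$

Integrating the $\delta\dot{\phi}$ by parts in $t$, and the $\delta(\partial\phi/\partial x)$ by parts in $x$, and discarding
the resulting ‘surface’ terms, we obtain

$$\delta S = \int \mathrm{d}t \int \mathrm{d}x\, \delta\phi \left[\frac{\partial\mathcal{L}}{\partial\phi} - \frac{\partial}{\partial x}\left(\frac{\partial\mathcal{L}}{\partial(\partial\phi/\partial x)}\right) - \frac{\partial}{\partial t}\left(\frac{\partial\mathcal{L}}{\partial\dot{\phi}}\right)\right]. \qquad (5.95)$$
Since $\delta\phi$ is an arbitrary function, the requirement $\delta S = 0$ yields the
Euler-Lagrange field equation

$$\frac{\partial\mathcal{L}}{\partial\phi} - \frac{\partial}{\partial x}\left(\frac{\partial\mathcal{L}}{\partial(\partial\phi/\partial x)}\right) - \frac{\partial}{\partial t}\left(\frac{\partial\mathcal{L}}{\partial\dot{\phi}}\right) = 0. \qquad (5.96)$$
The generalization to three dimensions is
$$\frac{\partial\mathcal{L}}{\partial\phi} - \boldsymbol{\nabla}\cdot\left(\frac{\partial\mathcal{L}}{\partial(\boldsymbol{\nabla}\phi)}\right) - \frac{\partial}{\partial t}\left(\frac{\partial\mathcal{L}}{\partial\dot{\phi}}\right) = 0. \qquad (5.97)$$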
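As an example, consider
$$\mathcal{L}_\rho = \frac{1}{2}\rho\left(\frac{\partial\phi}{\partial t}\right)^2 - \frac{1}{2}\rho c^2\left(\frac{\partial\phi}{\partial x}\right)^2 \qquad (5.98)$$

where the factors $\rho$ (mass density) and $c$ (a velocity) have been introduced to
get the dimension of $\mathcal{L}$ right. Inserting this into the Euler-Lagrange field
equation (5.96), we obtain

$$\frac{\partial^2\phi}{\partial t^2} = c^2\frac{\partial^2\phi}{\partial x^2} \qquad (5.99)$$
which is precisely the wave equation (5.30) for the one-dimensional string,
now obtained via the Euler-Lagrange field equations. Note that the Lagrange
density $\mathcal{L}$ has the expected form (cf (5.48)) of ‘kinetic energy density minus
potential energy density’.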
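For readers who want to see the intermediate steps, the three terms of the Euler-Lagrange field equation (5.96) can be evaluated one at a time for the string Lagrangian density (5.98). The sketch below assumes, as the text does implicitly, that $\rho$ and $c$ are constants.

```latex
\begin{align*}
\frac{\partial\mathcal{L}_\rho}{\partial\phi} &= 0, \\
\frac{\partial\mathcal{L}_\rho}{\partial(\partial\phi/\partial x)} &= -\rho c^2\,\frac{\partial\phi}{\partial x}
  \quad\Rightarrow\quad
  \frac{\partial}{\partial x}\!\left(\frac{\partial\mathcal{L}_\rho}{\partial(\partial\phi/\partial x)}\right)
  = -\rho c^2\,\frac{\partial^2\phi}{\partial x^2}, \\
\frac{\partial\mathcal{L}_\rho}{\partial\dot{\phi}} &= \rho\,\dot{\phi}
  \quad\Rightarrow\quad
  \frac{\partial}{\partial t}\!\left(\frac{\partial\mathcal{L}_\rho}{\partial\dot{\phi}}\right)
  = \rho\,\frac{\partial^2\phi}{\partial t^2}, \\
0 + \rho c^2\,\frac{\partial^2\phi}{\partial x^2} - \rho\,\frac{\partial^2\phi}{\partial t^2} &= 0
  \quad\Rightarrow\quad
  \frac{\partial^2\phi}{\partial t^2} = c^2\,\frac{\partial^2\phi}{\partial x^2}.
\end{align*}
```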
For the final step — the passage to quantum mechanics for a field system 
— we shall be interested in the Hamiltonian (total energy) of the system, 
just as we were for the discrete case. Though we shall not actually use the 
Hamiltonian in the classical field case, we shall introduce it here, generalizing 
it to the quantum theory in the following section. We recall that Hamiltonian 
mechanics is formulated in terms of coordinate variables (‘q’) and momentum 
variables (‘p’), rather than the $q$ and $\dot{q}$ of Lagrangian mechanics. In the
continuum (field) case, the Hamiltonian $H$ is written as the integral of a
density $\mathcal{H}$ (we remain in one dimension)

$$H = \int \mathrm{d}x\, \mathcal{H} \qquad (5.100)$$

while the coordinates $q_r(t)$ become the ‘coordinate field’ $\phi(x, t)$. The question
is what is the corresponding ‘momentum field’?

The answer to this is provided by a continuum version of the generalized
momentum derived from the Lagrangian approach (cf equation (5.44))

$$p = \partial L/\partial\dot{q}. \qquad (5.101)$$




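We define a ‘momentum field’ $\pi(x, t)$, technically called the ‘momentum
canonically conjugate to $\phi$’, by

$$\pi(x, t) = \partial\mathcal{L}/\partial\dot{\phi}(x, t) \qquad (5.102)$$

where $\mathcal{L}$ is now the Lagrangian density. Note that $\pi$ has dimensions of a
momentum density. In the classical particle mechanics case we define the
Hamiltonian by

$$H(p, q) = p\dot{q} - L. \qquad (5.103)$$

Here we define a Hamiltonian density $\mathcal{H}$ by

$$\mathcal{H}(\phi, \pi) = \pi(x, t)\dot{\phi}(x, t) - \mathcal{L}. \qquad (5.104)$$
Let us see how all this works for the one-dimensional string with $\mathcal{L}$ given
by
$$\mathcal{L}_\rho = \frac{1}{2}\rho\left(\frac{\partial\phi}{\partial t}\right)^2 - \frac{1}{2}\rho c^2\left(\frac{\partial\phi}{\partial x}\right)^2. \qquad (5.105)$$
We have
$$\pi(x, t) = \rho\,\partial\phi/\partial t \qquad (5.106)$$
and
$$\mathcal{H}_\rho = \frac{1}{\rho}\pi(x, t)^2 - \left[\frac{1}{2\rho}\pi(x, t)^2 - \frac{1}{2}\rho c^2\left(\frac{\partial\phi}{\partial x}\right)^2\right] = \frac{1}{2\rho}\pi(x, t)^2 + \frac{1}{2}\rho c^2\left(\frac{\partial\phi}{\partial x}\right)^2 \qquad (5.107)$$
so that

$$H_\rho = \int\left[\frac{1}{2\rho}\pi(x, t)^2 + \frac{1}{2}\rho c^2\left(\frac{\partial\phi(x, t)}{\partial x}\right)^2\right]\mathrm{d}x. \qquad (5.108)$$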


This has exactly the form we expect (see (5.35)), thus verifying the plausibility 
of the above prescription. 

Inserting the mode expansion (5.34) into (5.92) and (5.105) we obtain the 
result (just as in (5.36) and problem 5.1) 


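$$L_\rho = \int_0^{\ell}\mathrm{d}x\, \mathcal{L}_\rho = \frac{\ell}{2}\sum_{r=1}^{\infty}\left[\frac{1}{2}\rho\dot{A}_r^2 - \frac{1}{2}\rho\omega_r^2 A_r^2\right] \qquad (5.109)$$

confirming that the system is equivalent to an infinite number of oscillators.
The momentum canonically conjugate to $A_r$ is

$$p_r = \frac{\partial L_\rho}{\partial\dot{A}_r} = \frac{\ell}{2}\rho\dot{A}_r \qquad (5.110)$$
and the Hamiltonian is
$$H = \sum_{r=1}^{\infty}\left[\frac{2}{\ell}\,\frac{1}{2\rho}\,p_r^2 + \frac{\ell}{2}\,\frac{1}{2}\rho\omega_r^2 A_r^2\right]. \qquad (5.111)$$

We may cast (5.111) into nicer form by the change of variables

$$P_r = \sqrt{2/\ell\rho}\; p_r, \qquad Q_r = \sqrt{\ell\rho/2}\; A_r, \qquad (5.112)$$

in terms of which

$$H = \sum_{r=1}^{\infty}\left[\frac{1}{2}P_r^2 + \frac{1}{2}\omega_r^2 Q_r^2\right] \qquad (5.113)$$

just as in (5.56), with $N \to \infty$.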
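The change of variables (5.112) is easy to check numerically on a finite number of modes. The sketch below is an illustration rather than part of the original derivation: the function names are hypothetical, the Hamiltonian (5.111) is taken in the form $\sum_r [p_r^2/(\rho\ell) + (\ell/4)\rho\omega_r^2 A_r^2]$, and the sample frequencies $\omega_r = r\pi c/\ell$ are assumed to be those of the fixed-end string of section 5.1.

```python
import math


def hamiltonian_mode_amplitudes(p, amplitudes, omegas, rho, ell):
    """(5.111): H = sum_r [ p_r^2 / (rho * ell) + (ell / 4) * rho * omega_r^2 * A_r^2 ]."""
    return sum(pr ** 2 / (rho * ell) + 0.25 * ell * rho * w ** 2 * ar ** 2
               for pr, ar, w in zip(p, amplitudes, omegas))


def rescale(p, amplitudes, rho, ell):
    """(5.112): P_r = sqrt(2 / (ell * rho)) * p_r and Q_r = sqrt(ell * rho / 2) * A_r."""
    scale = math.sqrt(ell * rho / 2.0)
    return [pr / scale for pr in p], [scale * ar for ar in amplitudes]


def hamiltonian_oscillators(big_p, big_q, omegas):
    """(5.113): H = sum_r [ P_r^2 / 2 + omega_r^2 * Q_r^2 / 2 ]."""
    return sum(0.5 * pr ** 2 + 0.5 * w ** 2 * qr ** 2
               for pr, qr, w in zip(big_p, big_q, omegas))


rho, ell, c = 2.5, 1.7, 1.3                               # arbitrary sample values
omegas = [r * math.pi * c / ell for r in range(1, 4)]     # assumed string frequencies
p = [0.3, -1.2, 0.7]
amplitudes = [0.5, 0.1, -0.4]

big_p, big_q = rescale(p, amplitudes, rho, ell)
print(hamiltonian_mode_amplitudes(p, amplitudes, omegas, rho, ell))
print(hamiltonian_oscillators(big_p, big_q, omegas))      # agrees with the line above
```

Because $P_r$ and $Q_r$ are rescaled by reciprocal factors, the product $P_r Q_r = p_r A_r$ is unchanged, so the new pair is again canonically conjugate.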


5.2.5 Heisenberg—Lagrange—Hamilton quantum field 
mechanics 


Finally, we are ready to quantize the classical field formalism, and arrive at a
quantum field mechanics, at least for the scalar field $\phi(x, t)$. If we were
dealing with the case in which $\phi(x, t)$ represented the displacement of a
one-dimensional stretched string, quantization would be straightforward. We
would take the classical Hamiltonian (5.113) and promote the mode coordinates
$Q_r$ and their conjugate momenta $P_r$ to operators satisfying commutation
relations of the form (5.85). The rest of the analysis would be exactly as in
equations (5.86) to (5.89), except that the number of modes $N$ is infinite. But
in the case of the general scalar field, we do not want to impose the boundary
conditions $\phi(0, t) = \phi(\ell, t) = 0$, which led to the mode expansion (5.34). It is
then not so clear how to proceed.

Fortunately, the Lagrange-Hamilton field formalism does indicate the way 
forward, which is one good reason for developing it in the first place. (Another 
is that it is very well suited to the analysis of symmetries, a crucial aspect 
of gauge theories – see chapter 7.) In the previous section we introduced the
‘coordinate-like’ field $\phi(x, t)$ and (via the Lagrangian) the ‘momentum-like’
field $\pi(x, t)$. To pass to the quantized version of the field theory, we mimic
the procedure followed in the discrete case and promote both the quantities $\phi$
and $\pi$ to operators $\hat{\phi}$ and $\hat{\pi}$, in the Heisenberg picture. As usual, the distinctive
feature of quantum theory is the non-commutativity of certain basic quantities
in the theory: for example, the fundamental commutator ($\hbar = 1$)

$$[\hat{q}_r(t), \hat{p}_s(t)] = i\delta_{rs} \qquad (5.114)$$


of the discrete case. Thus we expect that the operators $\hat{\phi}$ and $\hat{\pi}$ will obey
some commutation relation which is a continuum generalization of (5.114).
The commutator will be of the form $[\hat{\phi}(x, t), \hat{\pi}(y, t)]$, since (recalling
figure 5.5) the discrete index $r$ or $s$ becomes the continuous variable $x$ or $y$; we
also note that (5.114) is between operators at equal times. The continuum
generalization of the $\delta_{rs}$ symbol is the Dirac $\delta$ function, $\delta(x - y)$, with the
properties

$$\int_{-\infty}^{\infty}\delta(x)\,\mathrm{d}x = 1 \qquad (5.115)$$

$$\int_{-\infty}^{\infty}\delta(x - y)f(x)\,\mathrm{d}x = f(y) \qquad (5.116)$$

for all reasonable functions $f$ (see appendix E). Thus the fundamental
commutator of quantum field theory is taken to be

$$[\hat{\phi}(x, t), \hat{\pi}(y, t)] = i\delta(x - y) \qquad (5.117)$$

in the one-dimensional case, with obvious generalization to the three-dimensional
case via the symbol $\delta^3(\boldsymbol{x} - \boldsymbol{y})$. Remembering that we have set $\hbar = 1$,
it is straightforward to check that the dimensions are consistent on both
sides. Variables $\hat{\phi}$ and $\hat{\pi}$ obeying such a commutation relation are said to
be ‘conjugate’ to each other.
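A lattice picture makes the passage from $\delta_{rs}$ to $\delta(x - y)$ concrete. The sketch below rests on an assumption that is not spelled out in the text: the field is sampled at points $x_r = r\Delta$ with a hypothetical spacing $\Delta$, integrals become Riemann sums $\sum_r \Delta\,(\ldots)$, and $\delta(x_r - x_s)$ is represented by $\delta_{rs}/\Delta$. With that stand-in, the two defining properties (5.115) and (5.116) hold on the lattice for any spacing.

```python
def lattice_delta(r, s, spacing):
    """Lattice stand-in for delta(x_r - x_s): the Kronecker delta divided by the spacing."""
    return 1.0 / spacing if r == s else 0.0


def integrate_delta(n_points, s, spacing):
    """Riemann-sum version of (5.115): sum_r delta(x_r - x_s) * spacing."""
    return sum(lattice_delta(r, s, spacing) * spacing for r in range(n_points))


def sift(f_values, s, spacing):
    """Riemann-sum version of (5.116): sum_r delta(x_r - x_s) * f(x_r) * spacing."""
    return sum(lattice_delta(r, s, spacing) * f * spacing
               for r, f in enumerate(f_values))


spacing = 0.1                                          # hypothetical lattice spacing
f_values = [(r * spacing) ** 2 for r in range(20)]     # samples of f(x) = x^2

print(integrate_delta(20, 7, spacing))                 # close to 1
print(sift(f_values, 7, spacing), f_values[7])         # the two values agree
```

The factor $1/\Delta$ is what lets the lattice delta grow without bound as $\Delta \to 0$ while its ‘integral’ stays equal to one, which is the behaviour required of $\delta(x - y)$ in (5.117).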

What about the commutator of two $\hat{\phi}$'s or two $\hat{\pi}$'s? In the discrete case,
two different $\hat{q}$'s (in the Heisenberg picture) will commute at equal times,
$[\hat{q}_r(t), \hat{q}_s(t)] = 0$, and so will two different $\hat{p}$'s. We therefore expect to
supplement (5.117) with

$$[\hat{\phi}(x, t), \hat{\phi}(y, t)] = [\hat{\pi}(x, t), \hat{\pi}(y, t)] = 0. \qquad (5.118)$$


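Let us now proceed to explore the effect of these fundamental commutator
assumptions, for the case of the Lagrangian density which yielded the wave
equation via the Euler-Lagrange equations, namely

$$\hat{\mathcal{L}}_\rho = \frac{1}{2}\rho\left(\frac{\partial\hat{\phi}}{\partial t}\right)^2 - \frac{1}{2}\rho c^2\left(\frac{\partial\hat{\phi}}{\partial x}\right)^2. \qquad (5.119)$$
If we remove $\rho$, and set $c = 1$, we obtain
$$\hat{\mathcal{L}} = \frac{1}{2}\left(\frac{\partial\hat{\phi}}{\partial t}\right)^2 - \frac{1}{2}\left(\frac{\partial\hat{\phi}}{\partial x}\right)^2 \qquad (5.120)$$
for which the Euler-Lagrange equation yields the field equation
$$\frac{\partial^2\hat{\phi}}{\partial t^2} - \frac{\partial^2\hat{\phi}}{\partial x^2} = 0. \qquad (5.121)$$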


We can think of (5.121) as a highly simplified (spin-0, one-dimensional) ver- 
sion of the wave equation satisfied by the electromagnetic potentials. We 
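may guess, then, that the associated quanta are massless, as we shall soon
confirm.
The Lagrangian density (5.120) is our prototype quantum field Lagrangian
(one often slips into leaving out the word ‘density’). Applying the quantized
version of (5.102) we then have

$$\hat{\pi}(x, t) = \frac{\partial\hat{\mathcal{L}}}{\partial\dot{\hat{\phi}}(x, t)} = \dot{\hat{\phi}}(x, t) \qquad (5.122)$$
and the Hamiltonian density is
$$\hat{\mathcal{H}} = \hat{\pi}\dot{\hat{\phi}} - \hat{\mathcal{L}} = \frac{1}{2}\hat{\pi}^2 + \frac{1}{2}\left(\frac{\partial\hat{\phi}}{\partial x}\right)^2. \qquad (5.123)$$

The total Hamiltonian is

$$\hat{H} = \int\hat{\mathcal{H}}\,\mathrm{d}x = \frac{1}{2}\int\left[\hat{\pi}^2 + \left(\frac{\partial\hat{\phi}}{\partial x}\right)^2\right]\mathrm{d}x. \qquad (5.124)$$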


It is not immediately clear how to find the eigenvalues and eigenstates of 
the operator H. However, it is exactly at this point that all our preliminary 
work on normal modes comes into its own. If we can write the Hamiltonian as 
some kind of sum over independent oscillators — i.e. modes — we shall know how 
to proceed. For the classical string with fixed end points which was considered 
in section 5.1, the mode expansion was simply a Fourier expansion. In the 
present case, we want to allow the field to extend throughout all of space, 
without the periodicity imposed by fixed-end boundary conditions. In that 
case, the Fourier series is replaced by a Fourier integral, and standing waves 
are replaced by travelling waves. For the classical field obeying the wave 
equation (5.30) there are plane-wave solutions 


$$\phi(x, t) \propto e^{ikx - i\omega t} \qquad (5.125)$$

where ($c = 1$)
$$\omega = k \qquad (5.126)$$

which is just the dispersion relation of light in vacuo. The general field may
be Fourier expanded in terms of these solutions:

$$\phi(x, t) = \int_{-\infty}^{\infty}\frac{\mathrm{d}k}{2\pi\sqrt{2\omega}}\left[a(k)e^{ikx - i\omega t} + a^*(k)e^{-ikx + i\omega t}\right] \qquad (5.127)$$
where we have required $\phi$ to be real. (The rather fussy factors $(2\pi\sqrt{2\omega})^{-1}$
are purely conventional, and determine the normalization of the expansion
coefficients $a$, $a^*$ and $\hat{a}$, $\hat{a}^\dagger$ later; in turn, the latter enter into the definition,
and normalization, of the states – see (5.143).) Similarly, the ‘momentum
field’ $\pi = \dot{\phi}$ is expanded as

$$\pi = \int_{-\infty}^{\infty}\frac{\mathrm{d}k}{2\pi\sqrt{2\omega}}(-i\omega)\left[a(k)e^{ikx - i\omega t} - a^*(k)e^{-ikx + i\omega t}\right]. \qquad (5.128)$$
We quantize these mode expressions by promoting $\phi \to \hat{\phi}$, $\pi \to \hat{\pi}$ and
assuming the commutator (5.117). Thus we write

$$\hat{\phi} = \int_{-\infty}^{\infty}\frac{\mathrm{d}k}{2\pi\sqrt{2\omega}}\left[\hat{a}(k)e^{ikx - i\omega t} + \hat{a}^\dagger(k)e^{-ikx + i\omega t}\right] \qquad (5.129)$$


and similarly for $\hat{\pi}$. The commutator (5.117) now determines the commutators
of the mode operators $\hat{a}$ and $\hat{a}^\dagger$:

$$[\hat{a}(k), \hat{a}^\dagger(k')] = 2\pi\delta(k - k'), \qquad [\hat{a}(k), \hat{a}(k')] = [\hat{a}^\dagger(k), \hat{a}^\dagger(k')] = 0 \qquad (5.130)$$
as shown in problem 5.6. These are the desired continuum analogues of the
discrete oscillator commutation relations

$$[\hat{a}_r, \hat{a}_s^\dagger] = \delta_{rs}, \qquad [\hat{a}_r, \hat{a}_s] = [\hat{a}_r^\dagger, \hat{a}_s^\dagger] = 0. \qquad (5.131)$$
The precise factor in front of the $\delta$-function in (5.130) depends on the
normalization choice made in the expansion of $\hat{\phi}$, (5.129). Problem 5.6 also shows
that the commutation relations (5.130) lead to (5.118) as expected.

The form of the $\hat{a}$, $\hat{a}^\dagger$ commutation relations (5.130) already suggests that
the $\hat{a}(k)$ and $\hat{a}^\dagger(k)$ operators are precisely the single-quantum destruction and
creation operators for the continuum problem. To verify this interpretation
and find the eigenvalues of $\hat{H}$, we now insert the expansion for $\hat{\phi}$ and $\hat{\pi}$ into
$\hat{H}$ of (5.124). One finds the remarkable result (problem 5.7)

$$\hat{H} = \int\frac{\mathrm{d}k}{2\pi}\left\{\tfrac{1}{2}\left(\hat{a}^\dagger(k)\hat{a}(k) + \hat{a}(k)\hat{a}^\dagger(k)\right)\omega\right\}. \qquad (5.132)$$


Comparing this with the single-oscillator result 


$$\hat{H} = \tfrac{1}{2}(\hat{a}^\dagger\hat{a} + \hat{a}\hat{a}^\dagger)\omega \qquad (5.133)$$

shows that, as anticipated in section 5.1, each classical mode of the field can
be quantized, and behaves like a separate oscillator coordinate, with its own
frequency $\omega = k$. The operator $\hat{a}^\dagger(k)$ creates, and $\hat{a}(k)$ destroys, a quantum
of the $k$ mode. The factor $(2\pi)^{-1}$ in $\hat{H}$ arises from our normalization choice.

We note that in the field operator $\hat{\phi}$ of (5.129), those terms which destroy
quanta go with the factor $e^{-i\omega t}$, while those which create quanta go with
$e^{+i\omega t}$. This choice is deliberate and is consistent with the ‘absorption’ and
‘emission’ factors $e^{\mp i\omega t}$ of ordinary time-dependent perturbation theory in
quantum mechanics (cf equation (A.33) of appendix A).

What is the mass of these quanta? We know that their frequency $\omega$ is
related to their wavenumber $k$ by (5.126), which, restoring $\hbar$'s and $c$'s, can
be regarded as equivalent to $\hbar\omega = \hbar ck$, or $E = cp$, where we use the Einstein
and de Broglie relations. This is precisely the $E$–$p$ relation appropriate to a
massless particle, as expected.

What is the energy spectrum? We expect the ground state to be determined
by the continuum analogue of

$$\hat{a}_r|0\rangle = 0 \quad \text{for all } r; \qquad (5.134)$$

namely
$$\hat{a}(k)|0\rangle = 0 \quad \text{for all } k. \qquad (5.135)$$

However, there is a problem with this. If we allow the Hamiltonian of (5.132)
to act on $|0\rangle$ the result is not (as we would expect) zero, because of the
$\hat{a}(k)\hat{a}^\dagger(k)$ term (the other term does give zero by (5.135)). In the single
oscillator case, we rewrote $\hat{a}\hat{a}^\dagger$ in terms of $\hat{a}^\dagger\hat{a}$ by using the commutation
relation (5.72), and this led to the ‘zero-point energy’, $\tfrac{1}{2}\omega$, of the oscillator
ground state. Adopting the same strategy here, we write $\hat{H}$ of (5.132) as

$$\hat{H} = \int\frac{\mathrm{d}k}{2\pi}\hat{a}^\dagger(k)\hat{a}(k)\omega + \frac{1}{2}\int\frac{\mathrm{d}k}{2\pi}[\hat{a}(k), \hat{a}^\dagger(k)]\omega. \qquad (5.136)$$

Now consider $\hat{H}|0\rangle$: we see from the definition of the vacuum (5.135) that the
first term will give zero as expected, but the second term is infinite, since the
commutation relation (5.130) produces the infinite quantity ‘$\delta(0)$’ as $k \to k'$;
moreover, the $k$ integral diverges.

This term is obviously the continuum analogue of the zero-point energy $\tfrac{1}{2}\omega$,
but because there are infinitely many oscillators, it is infinite. The conventional
ploy is to argue that only energy differences, relative to a conveniently
defined ground state, really matter, so that we may discard the infinite constant
in (5.136). Then the ground state $|0\rangle$ has energy zero, by definition, and
the eigenvalues of $\hat{H}$ are of the form


$$\int\frac{\mathrm{d}k}{2\pi}n(k)\omega \qquad (5.137)$$

where $n(k)$ is the number of quanta (counted by the number operator $\hat{a}^\dagger(k)\hat{a}(k)$)
of energy $\omega = k$. For each definite $k$, and hence $\omega$, the spectrum is like that of
the simple harmonic oscillator. The process of going from (5.132) to (5.136)
without the second term is called ‘normally ordering’ the $\hat{a}$ and $\hat{a}^\dagger$ operators:
in a ‘normally ordered’ expression, all $\hat{a}^\dagger$'s are to the left of all $\hat{a}$'s, with the
result that the vacuum value of such expressions is by definition zero.
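The statement that the zero-point term is infinite while energy differences are not can be illustrated with a regularized mode sum. The sketch below is a hedged illustration, not the procedure of the text: it assumes a hypothetical box of length `length` with periodic boundary conditions, so that the allowed wavenumbers are $k_n = 2\pi n/\text{length}$, and it keeps only the modes with $0 < |n| \le$ `n_max`. The function names are illustrative.

```python
import math


def mode_frequencies(length, n_max):
    """omega = |k| for k = 2*pi*n/length, n = +-1, ..., +-n_max (box-regularized modes)."""
    return {n: abs(2.0 * math.pi * n / length)
            for n in range(-n_max, n_max + 1) if n != 0}


def zero_point_energy(length, n_max):
    """Sum of omega/2 over the retained modes: the regularized discarded constant."""
    return sum(0.5 * w for w in mode_frequencies(length, n_max).values())


def normal_ordered_energy(occupations, length, n_max):
    """Sum of n(k) * omega(k) over occupied modes: the discrete analogue of (5.137).

    `occupations` maps a mode label n to its number of quanta; a label outside
    the retained range raises KeyError.
    """
    freqs = mode_frequencies(length, n_max)
    return sum(count * freqs[n] for n, count in occupations.items())


length = 2.0 * math.pi                 # with this choice k_n is just n
state = {1: 2, -3: 1}                  # two quanta with n = 1, one with n = -3

for n_max in (10, 100, 1000):
    # The zero-point sum grows roughly like n_max**2; the excitation energy does not change.
    print(n_max, zero_point_energy(length, n_max),
          normal_ordered_energy(state, length, n_max))
```

Raising the cut-off makes the discarded constant arbitrarily large, whereas the normally ordered energy of a given state is independent of it, which is the sense in which only energy differences relative to the vacuum are retained.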

It has to be admitted that the argument that only energy differences matter 
is false as far as gravity is concerned, which couples to all sources of energy. 
It would ultimately be desirable to have theories in which the vacuum energy 
came out finite from the start (as actually happens in ‘supersymmetric’ field 
theories — see for example Weinberg (1995), p 325); see also comment (3). 

We proceed on to the excited states. Any desired state in which excitation
quanta are present can be formed by the appropriate application of $\hat{a}^\dagger(k)$
operators to the ground state $|0\rangle$. For example, a two-quantum state containing
one quantum of momentum $k_1$ and another of momentum $k_2$ may be written
(cf (5.81))
$$|k_1, k_2\rangle \propto \hat{a}^\dagger(k_1)\hat{a}^\dagger(k_2)|0\rangle. \qquad (5.138)$$

A general state will contain an arbitrary number of quanta.

Once again, and this time more formally, we have completed the programme
outlined in section 5.1, ending up with the ‘quantization’ of a classical
field $\phi(x, t)$, as exemplified in the basic expression (5.129), together with the
interpretation of the operators $\hat{a}(k)$ and $\hat{a}^\dagger(k)$ as destruction and creation
operators for mode quanta. We have, at least implicitly, still retained up to this
point the ‘mechanical model’ of some material object oscillating: some kind
of infinitely extended ‘jelly’. We now throw away the mechanical props and
embrace the unadorned quantum field theory! We do not ask what is waving,
we simply postulate a field, such as $\hat{\phi}$, and quantize it. Its quanta of excitation
are what we call particles: for example, photons in the electromagnetic
case.

We end this long section with some further remarks about the formalism,
and the physical interpretation of our quantum field $\hat{\phi}$.


Comment (1) 


The alert reader, who has studied appendix I, may be worried about the
following (possible) consistency problem. The fields $\hat{\phi}$ and $\hat{\pi}$ are Heisenberg
picture operators, and obey the equations of motion

$$\dot{\hat{\phi}}(x, t) = -i[\hat{\phi}(x, t), \hat{H}] \qquad (5.139)$$

$$\dot{\hat{\pi}}(x, t) = -i[\hat{\pi}(x, t), \hat{H}] \qquad (5.140)$$

where $\hat{H}$ is given by (5.132). It is a good exercise to check (problem 5.8(a))
that (5.139) yields just the expected relation $\dot{\hat{\phi}}(x, t) = \hat{\pi}(x, t)$ (cf (5.122)).
Thus (5.140) becomes

$$\ddot{\hat{\phi}}(x, t) = -i[\hat{\pi}(x, t), \hat{H}]. \qquad (5.141)$$
However, we have assumed in our work here that $\hat{\phi}$ obeyed the wave equation
(cf (5.121))

$$\ddot{\hat{\phi}} = \frac{\partial^2}{\partial x^2}\hat{\phi}(x, t) \qquad (5.142)$$

as a consequence of the quantized version of the Euler-Lagrange equation (5.96).
Thus the right-hand sides of (5.141) and (5.142) need to be the same, for
consistency, and they are: see problem 5.8(b). Thus, at least in this case,
the Heisenberg operator equations of motion are consistent with the
Euler-Lagrange equations.


Comment (2) 


Following on from this, we may note that this formalism encompasses both
the wave and the particle aspects of matter and radiation. The former is
evident from the plane-wave expansion functions in the expansion of $\hat{\phi}$, (5.129),
which in turn originate from the fact that $\hat{\phi}$ obeys the wave equation (5.121).
The latter follows from the discrete nature of the energy spectrum and the
associated operators $\hat{a}$, $\hat{a}^\dagger$ which refer to individual quanta, i.e. particles.


Comment (3) 


Next, we may ask: what is the meaning of the ground state $|0\rangle$ for a quantum
field? It is undoubtedly the state with $n(k) = 0$ for all $k$, i.e. the state with
no quanta in it, and hence no particles in it, on our new interpretation. It is
therefore the vacuum! As we shall see later, this understanding of the vacuum
as the ground state of a field system is fundamental to much of modern particle
physics: for example, to quark confinement and to the generation of mass for
the weak vector bosons. Note that although we discarded the overall (infinite)
constant in $\hat{H}$, differences in zero-point energies can be detected; for example,
in the Casimir effect (Casimir 1948, Kitchener and Prosser 1957, Sparnaay
1958, Lamoreaux 1997, 1998). These and other aspects of the quantum field
theory vacuum are discussed in Aitchison (1985).


Comment (4) 


Consider the two-particle state (5.138): $|k_1, k_2\rangle \propto \hat{a}^\dagger(k_1)\hat{a}^\dagger(k_2)|0\rangle$. Since the
$\hat{a}^\dagger$ operators commute, (5.130), this state is symmetric under the interchange
$k_1 \leftrightarrow k_2$. This is an inevitable feature of the formalism as so far developed:
there is no possible way of distinguishing one quantum of energy from another,
and we expect the two-quantum state to be indifferent to the order in which
the quanta are put in it. However, this has an important implication for
the particle interpretation: since the state is symmetric under interchange
of the particle labels $k_1$ and $k_2$, it must describe identical bosons. How the
formalism is modified in order to describe the antisymmetric states required
for two fermionic quanta will be discussed in section 7.2.


Comment (5) 


Finally, the reader may well wonder how to connect the quantum field theory
formalism to ordinary ‘wavefunction’ quantum mechanics. The ability to see
this connection will be important in subsequent chapters and it is indeed quite
simple. Suppose we form a state containing one quantum of the $\hat{\phi}$ field, with
momentum $k'$:

$$|k'\rangle = N\hat{a}^\dagger(k')|0\rangle \qquad (5.143)$$

where $N$ is a normalization constant. Now consider the amplitude $\langle 0|\hat{\phi}(x, t)|k'\rangle$.
We expand this out as

$$\langle 0|\hat{\phi}(x, t)|k'\rangle = \langle 0|\int\frac{\mathrm{d}k}{2\pi\sqrt{2\omega}}\left[\hat{a}(k)e^{ikx - i\omega t} + \hat{a}^\dagger(k)e^{-ikx + i\omega t}\right]N\hat{a}^\dagger(k')|0\rangle. \qquad (5.144)$$
The ‘$\hat{a}^\dagger\hat{a}^\dagger$’ term will give zero since $\langle 0|\hat{a}^\dagger = 0$. For the other term we use the
commutation relation (5.130) to write it as

$$\langle 0|N\int\frac{\mathrm{d}k}{2\pi\sqrt{2\omega}}\left[\hat{a}^\dagger(k')\hat{a}(k) + 2\pi\delta(k - k')\right]e^{ikx - i\omega t}|0\rangle = N\frac{e^{ik'x - i\omega' t}}{\sqrt{2\omega'}} \qquad (5.145)$$

using the vacuum condition once again, and integrating over the $\delta$ function
using the property (5.116) which sets $k = k'$ and hence $\omega = \omega'$. The vacuum
is normalized to unity, $\langle 0|0\rangle = 1$. The normalization constant $N$ can be
adjusted according to the desired convention for the normalization of the
states and wavefunctions. The result is just the plane-wave wavefunction for
a particle in the state $|k'\rangle$! Thus we discover that the vacuum to one-particle
matrix elements of the field operators are just the familiar wavefunctions of
single-particle quantum mechanics. In this connection we can explain some
common terminology. The path to quantum field theory that we have followed
is sometimes called ‘second quantization’, ordinary single-particle quantum
mechanics being the first-quantized version of the theory.

5.3 Generalizations: four dimensions, relativity and mass 


In the previous section we have shown how quantum mechanics may be married
to field theory, but we have considered only one spatial dimension, for
simplicity. Now we must generalize to three and incorporate the demands of
relativity. This is very easy to do in the Lagrangian approach, for the scalar
field $\phi(\boldsymbol{x}, t)$. ‘Scalar’ means that the field has only one independent
component at each point $(\boldsymbol{x}, t)$, unlike the electromagnetic field, for instance,
for which the analogous quantity has four components, making up a 4-vector
field $A^\mu(\boldsymbol{x}, t) = (A^0(\boldsymbol{x}, t), \boldsymbol{A}(\boldsymbol{x}, t))$ (see chapter 7). In the quantum case, a
one-component field (or wavefunction) is appropriate for spin-0 particles.

As we saw in (5.97), the three-dimensional Euler-Lagrange equations are

$$\frac{\partial\mathcal{L}}{\partial\phi} - \boldsymbol{\nabla}\cdot\left(\frac{\partial\mathcal{L}}{\partial(\boldsymbol{\nabla}\phi)}\right) - \frac{\partial}{\partial t}\left(\frac{\partial\mathcal{L}}{\partial\dot{\phi}}\right) = 0 \qquad (5.146)$$
which may immediately be rewritten in relativistically invariant form
$$\frac{\partial\mathcal{L}}{\partial\phi} - \partial_\mu\left(\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)}\right) = 0 \qquad (5.147)$$

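where $\partial_\mu = \partial/\partial x^\mu$. Similarly, the action

$$S = \int \mathrm{d}t\, L = \int \mathrm{d}^4x\, \mathcal{L} \qquad (5.148)$$
will be relativistically invariant if $\mathcal{L}$ is, since the volume element $\mathrm{d}^4x$ is
invariant. Thus, to construct a relativistic field theory, we have to construct
an invariant density $\mathcal{L}$ and use the already given covariant Euler-Lagrange
equation. Thus our previous string Lagrangian

$$\mathcal{L}_\rho = \frac{1}{2}\rho\left(\frac{\partial\phi}{\partial t}\right)^2 - \frac{1}{2}\rho c^2\left(\frac{\partial\phi}{\partial x}\right)^2 \qquad (5.149)$$

with $\rho = c = 1$ generalizes to

$$\mathcal{L} = \tfrac{1}{2}\partial_\mu\phi\,\partial^\mu\phi \qquad (5.150)$$
and produces the invariant wave equation

$$\partial_\mu\partial^\mu\phi = \left(\frac{\partial^2}{\partial t^2} - \boldsymbol{\nabla}^2\right)\phi = 0. \qquad (5.151)$$
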
All of this goes through just the same when the fields are quantized. 

This invariant Lagrangian describes a field whose quanta are massless.
To find the Lagrangian for the case of massive quanta, we need to find the
Lagrangian that gives us the Klein-Gordon equation (see section 3.1)

$$(\Box + m^2)\phi(\boldsymbol{x}, t) = 0 \qquad (5.152)$$

via the Euler-Lagrange equations.
The answer is a simple generalization of (5.150):

$$\mathcal{L}_{\mathrm{KG}} = \tfrac{1}{2}\partial_\mu\phi\,\partial^\mu\phi - \tfrac{1}{2}m^2\phi^2. \qquad (5.153)$$
The plane-wave solutions of the field equation (now the KG equation) have
frequencies (or energies) given by
$$\omega^2 = \boldsymbol{k}^2 + m^2 \qquad (5.154)$$
which is the correct energy-momentum relation for a massive particle.
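The dispersion relation (5.154) can be explored with a few lines of code. This is a small illustrative sketch in units with $\hbar = c = 1$; the function names are assumptions, and the group velocity $\partial\omega/\partial\boldsymbol{k} = \boldsymbol{k}/\omega$ is a standard consequence of (5.154) rather than something stated in the text.

```python
import math


def energy(k, m):
    """omega = +sqrt(k.k + m^2) for a 3-component wavevector k and mass m."""
    return math.sqrt(sum(component * component for component in k) + m * m)


def group_velocity(k, m):
    """d(omega)/dk = k / omega; its magnitude stays below 1 whenever m > 0."""
    omega = energy(k, m)
    if omega == 0.0:
        raise ValueError("group velocity is undefined for k = 0 and m = 0")
    return [component / omega for component in k]


print(energy((3.0, 0.0, 0.0), 4.0))          # 5.0, a 3-4-5 triangle
print(energy((1.0, 2.0, 2.0), 0.0))          # 3.0, the massless case omega = |k|
print(group_velocity((3.0, 0.0, 0.0), 4.0))  # [0.6, 0.0, 0.0]
```

Setting `m = 0` recovers $\omega = |\boldsymbol{k}|$, the three-dimensional version of the massless relation (5.126), while any non-zero mass gives quanta that travel slower than light.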


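How do we quantize this field theory? The four-dimensional analogue of
the Fourier expansion of the field $\hat{\phi}$ takes the form

$$\hat{\phi}(\boldsymbol{x}, t) = \int\frac{\mathrm{d}^3\boldsymbol{k}}{(2\pi)^3\sqrt{2\omega}}\left[\hat{a}(k)e^{-ik\cdot x} + \hat{a}^\dagger(k)e^{ik\cdot x}\right] \qquad (5.155)$$

with a similar expansion for the ‘conjugate momentum’ $\hat{\pi} = \dot{\hat{\phi}}$:

$$\hat{\pi}(\boldsymbol{x}, t) = \int\frac{\mathrm{d}^3\boldsymbol{k}}{(2\pi)^3\sqrt{2\omega}}(-i\omega)\left[\hat{a}(k)e^{-ik\cdot x} - \hat{a}^\dagger(k)e^{ik\cdot x}\right]. \qquad (5.156)$$

Here $k\cdot x$ is the four-dimensional dot product $k\cdot x = \omega t - \boldsymbol{k}\cdot\boldsymbol{x}$, and $\omega =
+(\boldsymbol{k}^2 + m^2)^{1/2}$. The Hamiltonian is found to be

$$\hat{H}_{\mathrm{KG}} = \int \mathrm{d}^3\boldsymbol{x}\, \hat{\mathcal{H}}_{\mathrm{KG}} = \frac{1}{2}\int \mathrm{d}^3\boldsymbol{x}\left[\hat{\pi}^2 + \boldsymbol{\nabla}\hat{\phi}\cdot\boldsymbol{\nabla}\hat{\phi} + m^2\hat{\phi}^2\right] \qquad (5.157)$$
and this can be expressed in terms of the $\hat{a}$'s and the $\hat{a}^\dagger$'s using the expansion
for $\hat{\phi}$ and $\hat{\pi}$ and the commutator

$$[\hat{a}(k), \hat{a}^\dagger(k')] = (2\pi)^3\delta^3(\boldsymbol{k} - \boldsymbol{k}') \qquad (5.158)$$
with all others vanishing. The result is, as expected,

$$\hat{H}_{\mathrm{KG}} = \frac{1}{2}\int\frac{\mathrm{d}^3\boldsymbol{k}}{(2\pi)^3}\left[\hat{a}^\dagger(k)\hat{a}(k) + \hat{a}(k)\hat{a}^\dagger(k)\right]\omega \qquad (5.159)$$

and, normally ordering as usual, we arrive at

$$\hat{H}_{\mathrm{KG}} = \int\frac{\mathrm{d}^3\boldsymbol{k}}{(2\pi)^3}\hat{a}^\dagger(k)\hat{a}(k)\omega. \qquad (5.160)$$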


This supports the physical interpretation of the mode operators $\hat{a}^\dagger$ and $\hat{a}$ as
creation and destruction operators for quanta of the field $\hat{\phi}$ as before, except
that now the energy-momentum relation for these particles is the relativistic
one, for particles of mass $m$.

Since $\hat{\phi}$ is real ($\hat{\phi} = \hat{\phi}^\dagger$) and has no spin degrees of freedom, it is called
a real scalar field. Only field quanta of one type enter: those created by
$\hat{a}^\dagger$ and destroyed by $\hat{a}$. Thus $\hat{\phi}$ would correspond physically to a case where
there was a unique particle state of a given mass $m$, for example the $\pi^0$ field.
Actually, of course, we would not want to describe the $\pi^0$ in any fundamental
sense in terms of such a field, since we know it is not a point-like object (‘$\hat{\phi}$’
is defined only at the single space-time point $(\boldsymbol{x}, t)$). The question of whether
true ‘elementary’ scalar fields exist in nature is an interesting one: in the
Standard Model, as we shall eventually see in volume 2, the Higgs field is a
scalar field (though it contains several components with different charge). It
remains to be seen if this field, and the associated quantum, the Higgs boson,
is a scalar, and if so whether it is elementary or composite.

We have learned how to describe free relativistic spinless particles of finite 
mass as the quanta of a relativistic quantum field. We now need to understand 
interactions in quantum field theory. 


Problems 
5.1 Verify equation (5.36). 


5.2 Consider one-dimensional motion under gravity so that $V(x) = -mgx$ in
(5.39). Evaluate $S$ of (5.38) for $t_1 = 0$, $t_2 = t_0$, for three possible trajectories:

(a) $x(t) = at$,
(b) $x(t) = \tfrac{1}{2}gt^2$ (the Newtonian result) and
(c) $x(t) = bt^3$

where the constants $a$ and $b$ are to be chosen so that all the trajectories end
at the same point $x(t_0)$.


5.3
(a) Use (5.57) and (5.63) to verify that

$$\hat{p} = m\dot{\hat{q}}$$

is consistent with the Heisenberg equation of motion for $\hat{A} = \hat{q}$.

(b) By similar methods verify that

$$\dot{\hat{p}} = -m\omega^2\hat{q}.$$

5.4
(a) Rewrite the Hamiltonian $\hat{H}$ of (5.63) in terms of the operators $\hat{a}$
and $\hat{a}^\dagger$.
(b) Evaluate the commutator between $\hat{a}$ and $\hat{a}^\dagger$ and use this result
together with your expression for $\hat{H}$ from part (a) to verify
equation (5.73).
(c) Verify that for $|n\rangle$ given by equation (5.81) the normalization
condition
$$\langle n|n\rangle = 1$$
is satisfied.
(d) Verify (5.83) directly using the commutation relation (5.72).

5.5 Treating $\psi$ and $\psi^*$ as independent classical fields, show that the
Lagrangian density
$$\mathcal{L} = i\psi^*\dot{\psi} - (1/2m)\boldsymbol{\nabla}\psi^*\cdot\boldsymbol{\nabla}\psi$$
gives the Schrödinger equation for $\psi$ and $\psi^*$ correctly.
5.6

(a) Verify that the commutation relations for $\hat{a}(k)$ and $\hat{a}^\dagger(k)$ (equations
(5.130)) are consistent with the equal time commutation relation
between $\hat{\phi}$ and $\hat{\pi}$ (equation (5.117)), and with (5.118).

(b) Consider the unequal time commutator $D(x_1, x_2) = [\hat{\phi}(\boldsymbol{x}_1, t_1),
\hat{\phi}(\boldsymbol{x}_2, t_2)]$, where $\hat{\phi}$ is a massive KG field in three dimensions. Show
that

$$D(x_1, x_2) = \int\frac{\mathrm{d}^3\boldsymbol{k}}{(2\pi)^3 2E}\left[e^{-ik\cdot(x_1 - x_2)} - e^{ik\cdot(x_1 - x_2)}\right] \qquad (5.161)$$

where $k\cdot(x_1 - x_2) = E(t_1 - t_2) - \boldsymbol{k}\cdot(\boldsymbol{x}_1 - \boldsymbol{x}_2)$, and $E = (\boldsymbol{k}^2 +
m^2)^{1/2}$. Note that $D$ is not an operator, and that it depends only
on the difference of coordinates $x_1 - x_2$, consistent with translation
invariance. Show that $D(x_1, x_2)$ vanishes for $t_1 = t_2$. Explain why
the right-hand side of (5.161) is Lorentz invariant (see the exercise
in appendix E), and use this fact to show that $D(x_1, x_2)$ vanishes
for all space-like separations $(x_1 - x_2)^2 < 0$. Discuss the significance
of this result, or see the discussion in section 6.3.2!


5.7 Insert the plane-wave expansions for the operators $\hat{\phi}$ and $\hat{\pi}$ into the
equation for $\hat{H}$, (5.124), and verify equation (5.132). [Hint: note that $\omega$ is defined
to be always positive, so that (5.126) should strictly be written $\omega = |k|$.]

5.8

(a) Use (5.117) and (5.124) to verify that $\hat{\pi}(x, t) = \dot{\hat{\phi}}(x, t)$ is consistent
with the Heisenberg equation of motion for $\hat{\phi}(x, t)$. [Hint: write the
integral in (5.124) as over $y$, not $x$!]

(b) Similarly, verify the consistency of (5.141) and (5.121).


6 


Quantum Field Theory II: Interacting Scalar 
Fields 


6.1 Interactions in quantum field theory: qualitative 
introduction 


In the previous chapter we considered only free — i.e. non-interacting — quan- 
tum fields. The fact that they are non-interacting is evident in a number of 
ways. The mode expansions (5.129) and (5.155) are written in terms of the 
(free) plane-wave solutions of the associated wave equations. Also the Hamil- 
tonians turned out to be just the sum of individual oscillator Hamiltonians 
for each mode frequency, as in (5.132) or (5.159). The energies of the quanta 
add up — they are non-interacting quanta. Finally, since the Hamiltonians are 
just sums of number operators 


$$\hat{n}(k) = \hat{a}^\dagger(k)\hat{a}(k) \qquad (6.1)$$


it is obvious that each such operator commutes with the Hamiltonian and is 
therefore a constant of the motion. Thus two waves, each with one excitation 
quantum, travelling towards each other will pass smoothly through each other 
and emerge unscathed on the other side — they will not interact at all. 

How can we get the mode quanta to interact? If we return to our dis- 
cussion of classical mechanical systems in section 5.1, we see that the crucial 
step in arriving at the ‘sum over oscillators’ form for the energy was the as- 
sumption that the potential energy was quadratic in the small displacements 
$q_r$. We expect that ‘modes will interact’ when we go beyond this harmonic
approximation. The same is true in the continuous (wave or field) case. In the 
derivation of the appropriate wave equation you will find that somewhere an 
approximation like $\tan\phi \approx \phi$ or $\sin\phi \approx \phi$ is made. This linearizes the
equation, and solutions to linear equations can be linearly superposed to make new
solutions. If we retain higher powers of $\phi$, such as $\phi^3$, the resulting nonlinear
equation has solutions that cannot be obtained by superposing two indepen- 
dent solutions. Thus two waves travelling towards each other will not just 
pass smoothly through each other: various forms of interaction and distortion 
of the original waveforms will occur. 

What happens when we quantize such anharmonic systems? To gain some 
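idea of the new features that emerge, consider just one ‘anharmonic oscillator’
with Hamiltonian

$$\hat H = (1/2m)\hat p^2 + \tfrac{1}{2}m\omega^2\hat q^2 + \lambda\hat q^3. \qquad (6.2)$$

In terms of the $\hat a$ and $\hat a^\dagger$ combinations this becomes

$$\hat H = \tfrac{1}{2}(\hat a^\dagger\hat a + \hat a\hat a^\dagger)\omega + \frac{\lambda}{(2m\omega)^{3/2}}(\hat a + \hat a^\dagger)^3 \qquad (6.3)$$

$$\phantom{\hat H} \equiv \hat H_0 + \lambda\hat H' \qquad (6.4)$$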


where $\hat H_0$ is our previous free oscillator Hamiltonian. The algebraic tricks we
used to find the spectrum of $\hat H_0$ do not work for this new $\hat H$ because of the
addition of the $\hat H'$ interaction term. In particular, although $\hat H_0$ commutes with
the number operator $\hat a^\dagger\hat a$, $\hat H'$ does not. Therefore, whatever the eigenstates of
$\hat H$ are, they will not in general have a definite number of ‘$\hat H_0$ quanta’. In fact,
we cannot find an exact algebraic solution to this new eigenvalue problem,
and we must resort to perturbation theory or to numerical methods.

The perturbative solution to this problem treats $\lambda\hat H'$ as a perturbation
and expands the true eigenstates of $\hat H$ in terms of the eigenstates of $\hat H_0$:

$$|\bar r\rangle = \sum_n c_{rn}|n\rangle. \qquad (6.5)$$

From this expansion we see that, as expected, the true eigenstates $|\bar r\rangle$ will
‘contain different numbers of $\hat H_0$ quanta’: $|c_{rn}|^2$ is the probability of finding $n$
‘$\hat H_0$ quanta’ in the state $|\bar r\rangle$. Perturbation theory now proceeds by expanding
the coefficients $c_{rn}$ and exact energy eigenvalues $E_r$ as power series in the
strength $\lambda$ of the perturbation. For example, the exact energy eigenvalue has
the expansion


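$$E_r = E_r^{(0)} + \lambda E_r^{(1)} + \lambda^2 E_r^{(2)} + \cdots \qquad (6.6)$$

where

$$\hat H_0|r\rangle = E_r^{(0)}|r\rangle \qquad (6.7)$$

and

$$E_r^{(1)} = \langle r|\hat H'|r\rangle \qquad (6.8)$$

$$E_r^{(2)} = \sum_{s\neq r}\frac{\langle r|\hat H'|s\rangle\langle s|\hat H'|r\rangle}{E_r^{(0)} - E_s^{(0)}}. \qquad (6.9)$$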

To evaluate the second-order shift in energy, we therefore need to consider 
matrix elements of the form 


$$\langle s|(\hat a + \hat a^\dagger)^3|r\rangle. \qquad (6.10)$$


Keeping careful track of the order of the $\hat a$ and $\hat a^\dagger$ operators, we can evaluate
these matrix elements and find, in this case, that there are non-zero matrix
elements for states $\langle s| = \langle r+3|$, $\langle r+1|$, $\langle r-1|$ and $\langle r-3|$.

What about the quantum mechanics of two coupled nonlinear oscillators?
In the same way, the general state is assumed to be a superposition

$$|\bar r\rangle = \sum_{n_1,n_2} c_{r,n_1n_2}|n_1\rangle|n_2\rangle \qquad (6.11)$$

of states of arbitrary numbers of quanta of the unperturbed oscillator
Hamiltonians $\hat H_{0(1)}$ and $\hat H_{0(2)}$. States of the unperturbed system contain definite
numbers $n_1$ and $n_2$, say, of the ‘1’ and ‘2’ quanta. Perturbation calculations of
the interacting system will involve matrix elements connecting such $|n_1\rangle|n_2\rangle$
states to states $|n_1'\rangle|n_2'\rangle$ with different numbers of these quanta.
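
The single-oscillator matrix elements (6.10) and the energy shifts (6.8) and (6.9) can be checked numerically. What follows is a minimal sketch rather than anything taken from the text: it assumes a number basis truncated at dimension `N` (an illustrative choice), units in which $\omega = 1$ and the factor $(2m\omega)^{-3/2}$ of (6.3) is absorbed into $\lambda$, and helper names (`matmul`, `nonzero_offsets`, `second_order_shift`) that are hypothetical.

```python
import math

N = 12  # truncation of the number basis (assumption: ample for the lowest states)

def matmul(A, B):
    """Multiply two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# x[s][r] = <s|(a + a^dagger)|r>; both non-zero entries equal sqrt(r + 1).
x = [[0.0] * N for _ in range(N)]
for r in range(N - 1):
    x[r + 1][r] = math.sqrt(r + 1)  # <r+1|a^dagger|r>
    x[r][r + 1] = math.sqrt(r + 1)  # <r|a|r+1>

# Matrix of (a + a^dagger)^3, which plays the role of H' in these units.
x3 = matmul(x, matmul(x, x))

def nonzero_offsets(r, tol=1e-12):
    """Values of s - r for which <s|(a + a^dagger)^3|r> is non-zero."""
    return sorted(s - r for s in range(N) if abs(x3[s][r]) > tol)

def second_order_shift(r, omega=1.0):
    """E_r^(2) of (6.9), with E_r^(0) - E_s^(0) = (r - s) * omega."""
    total = 0.0
    for s in range(N):
        if s != r:
            total += x3[r][s] * x3[s][r] / ((r - s) * omega)
    return total
```

For a level well below the truncation, `nonzero_offsets(3)` returns `[-3, -1, 1, 3]`, the selection rule quoted above. The diagonal elements vanish, so the first-order shift (6.8) is zero, and in these units `second_order_shift(0)` and `second_order_shift(1)` evaluate to -11 and -71 up to rounding.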

All this can be summarized by the remark that the typical feature of 
quantized interacting modes is that we need to consider processes in which 
the numbers of the different mode quanta are not constants of the motion. 
This is, of course, exactly what happens when we have collisions between 
high-energy particles. When far apart the particles, definite in number, are 
indeed free and are just the mode quanta of some quantized fields. But, when 
they interact, we must expect to see changes in the numbers of quanta, and 
can envisage processes in which the number of quanta which emerge finally 
as free particles is different from the number that originally collided. From 
the quantum mechanical examples we have discussed, we expect that these 
interactions will be produced by terms like $\hat\phi^3$ or $\hat\phi^4$, since the free (‘harmonic’)
case has $\hat\phi^2$, analogous to $\hat q^2$ in the quantum mechanics example. Such
terms arise in the solid state phonon application precisely from anharmonic 
corrections involving the atomic displacements. These terms lead to non- 
trivial phonon-phonon scattering, the treatment of which forms the basis of 
the quantum theory of thermal resistivity of insulators. In the quantum field 
theory case, when we have generalized the formalism to fermions and photons, 
the nonlinear interaction terms will produce $e^+e^-$ scattering, $q\bar q$ annihilation
and so on. As in the quantum mechanical case, the basic calculational method 
will be perturbation theory. 

As remarked earlier, the trouble with all these ‘real-life’ cases is that they 
involve significant complications due to spin; the corresponding fields then 
have several components, with attendant complexity in the solutions of the 
associated free-particle wave equations (Maxwell, Dirac). So in this chapter 
we shall seek to explain the essence of the perturbative approach to quantum 
field dynamics — which we take to be essentially the Feynman graph version 
of Yukawa’s exchange mechanism — in the context of simple models involving 
only scalar fields; Maxwell (vector) and Dirac (spinor) fields will be introduced 
in the following chapter. The route we follow to the ‘Feynman rules’ is the one 
first given (with remarkable clarity) by Dyson (1949a), which rapidly became 
the standard formulation. 

Before proceeding it may be worth emphasizing that in introducing a
‘non-harmonic’ term such as $\hat\phi^3$ and thus departing from linearity in that sense,
we are in no way affecting the basic linearity of state vector superposition in 
quantum mechanics (cf (6.11)), which continues to hold.


6.2 Perturbation theory for interacting fields: the Dyson 
expansion of the S-matrix 


On the third day of the journey a remarkable thing happened; going into 
a sort of semi-stupor as one does after 48 hours of bus-riding, I began to 
think very hard about physics, and particularly about the rival radiation 
theories of Schwinger and Feynman. Gradually my thoughts grew more 
coherent, and before I knew where I was, I had solved the problem that 
had been in the back of my mind all this year, which was to prove the 
equivalence of the two theories. 


—From a letter from F. J. Dyson to his parents, 18 September 1948, as 
quoted in Schweber (1994), p 505. 


For definiteness, let us consider the Lagrangian 
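$$\hat{\mathcal L} = \tfrac{1}{2}\partial_\mu\hat\phi\,\partial^\mu\hat\phi - \tfrac{1}{2}m^2\hat\phi^2 - \lambda\hat\phi^3 \equiv \hat{\mathcal L}_{\rm KG} - \lambda\hat\phi^3 \qquad (6.12)$$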


with $\lambda > 0$. Equation (6.12) is like ‘$\mathcal L = T - V$’ where
$V = \tfrac{1}{2}(\nabla\hat\phi)^2 + \tfrac{1}{2}m^2\hat\phi^2 + \lambda\hat\phi^3$ is the ‘potential’. Though simple, this Lagrangian is unfortunately not
physically sensible. The classical particle analogue potential would have the
form $V(q) = \tfrac{1}{2}\omega^2q^2 + \lambda q^3$. If we sketch $V(q)$ as a function of $q$ we see that,
for small $\lambda$, it retains the shape of an oscillator well near $q = 0$, but for $q$
sufficiently large and negative it will ‘turn over’, tending ultimately to $-\infty$ as
$q \to -\infty$. Classically we expect to be able to set up a successful perturbation
theory for oscillations about the equilibrium position $q = 0$, provided that
the amplitude of the oscillations is not so large as to carry the particle over
the ‘lip’ of the potential; in the latter case, the particle will escape to
$q = -\infty$, invalidating a perturbative approach. In the quantum mechanical case
the same potential $V(q)$ is more problematical, since the particle can tunnel
through the barrier separating it from the region where $V \to -\infty$. This
means that the ground state will not be stable. An analogous disease affects 
the quantum field case — the supposed vacuum state will be unstable, and 
indeed the energy will not be positive-definite. 
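
The position and height of the ‘lip’ follow from setting $V'(q) = \omega^2 q + 3\lambda q^2 = 0$, which gives $q = -\omega^2/3\lambda$ and $V = \omega^6/54\lambda^2$. The short sketch below, with illustrative parameter values and hypothetical function names, simply evaluates these expressions; it is meant only to make the shape of $V(q)$ concrete.

```python
def cubic_potential(q, omega=1.0, lam=0.1):
    """Classical analogue potential V(q) = omega^2 q^2 / 2 + lam q^3."""
    return 0.5 * omega**2 * q**2 + lam * q**3

def barrier(omega=1.0, lam=0.1):
    """Location and height of the local maximum (the 'lip') for lam > 0."""
    q_lip = -omega**2 / (3.0 * lam)       # non-trivial root of V'(q) = 0
    height = omega**6 / (54.0 * lam**2)   # V(q_lip)
    return q_lip, height
```

As $\lambda \to 0$ the lip moves off to $q \to -\infty$ and its height grows like $1/\lambda^2$, which is why small oscillations about $q = 0$ remain a sensible starting point for classical perturbation theory.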

Nevertheless, as the reader may already have surmised, and we shall confirm
later in this chapter, the ‘$\phi$-cubed’ interaction is precisely of the form
relevant to Yukawa’s exchange mechanism. As we have seen in the previous
section, such an interaction will typically give rise to matrix elements
between one-quantum and two-quantum states, for example, exactly like the
basic Yukawa emission and absorption process. In fact, all that is necessary
to make the $\hat\phi^3$-type interaction physical is to let it describe, not the
‘self-coupling’ of a single field, but the ‘interactive coupling’ of at least two
different fields. For example, we may have two scalar fields with quanta ‘A’
and ‘B’, and an interaction between them of the form $\lambda\hat\phi_A^2\hat\phi_B$. This will allow
processes such as A ↔ A + B. Or we may have three such fields, and an
interaction $\lambda\hat\phi_A\hat\phi_B\hat\phi_C$, allowing A ↔ B + C and similar transitions. In these cases
the problems with the $\hat\phi^3$ self-interaction do not arise. (Incidentally those
problems can be eliminated by the addition of a suitable higher-power term,
for instance $g\hat\phi^4$.) In later sections we shall be considering the ‘ABC’ model
specifically, but for the present it will be simpler to continue with the single
field $\hat\phi$ and the self-interaction $\lambda\hat\phi^3$, as described by the Lagrangian (6.12).
The associated Hamiltonian is 


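$$\hat H = \hat H_{\rm KG} + \hat H' \qquad (6.13)$$

where (as is usual in perturbation theory) we have separated the Hamiltonian
into a part we can handle exactly, which is the free Klein-Gordon Hamiltonian

$$\hat H_{\rm KG} = \int d^3x\,\hat{\mathcal H}_{\rm KG} = \frac{1}{2}\int d^3x\,[\hat\pi^2 + (\nabla\hat\phi)^2 + m^2\hat\phi^2] \qquad (6.14)$$

and the part we shall treat perturbatively

$$\hat H' = \int d^3x\,\hat{\mathcal H}' = \lambda\int d^3x\,\hat\phi^3. \qquad (6.15)$$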


6.2.1 The interaction picture 


We begin with a crucial formal step. In our introduction to quantum field 
theory in the previous chapter, we worked in the Heisenberg picture (HP). 
There, however, we only dealt with free (non-interacting) fields. The time 
dependence of the operators as given by the mode expansion (5.155) is that 
generated by the free KG Hamiltonian (6.14) via the Heisenberg equations 
of motion (see problem 5.8). But as soon as we include the interaction term 
$\hat H'$, we cannot make progress in the HP, since we do not then know the time
dependence of the operators, which is generated by the full Hamiltonian
$\hat H = \hat H_{\rm KG} + \hat H'$.

Instead, we might consider using the Schrödinger picture (SP) in which 
the states change with time according to 


$$\hat H|\psi(t)\rangle = i\frac{d}{dt}|\psi(t)\rangle \qquad (6.16)$$


and the operators are time-independent (see appendix I). Note that although 
(6.16) is a ‘Schrödinger picture’ equation, there is nothing non-relativistic 
about it: on the contrary, $\hat H$ is the relevant relativistic Hamiltonian. In this
approach, the field operators appearing in the density $\hat{\mathcal H}$ are all evaluated at a
fixed time, say $t = 0$ by convention, which is the time at which the Schrödinger
and Heisenberg pictures coincide. At this fixed time, mode expansions of the 
form (5.155) with t = 0 are certainly possible, since the basis functions form 
a complete set. 

One problem with this formulation, however, is that it is not going to be 
manifestly ‘Lorentz invariant’ (or covariant), because a particular time (t = 0)
has been singled out. In the end, physical quantities should come out correct,
but it is much more convenient to have everything looking nice and consistent 
with relativity as we go along. This is one of the reasons for choosing to 
work in yet a third ‘picture’, an ingenious kind of half-way-house between 
the other two, called the ‘interaction picture’ (IP). We shall see other good 
reasons shortly. 

In the HP, all the time dependence is carried by the operators and none by 
the state, while in the SP it is exactly the other way around. In the IP, both 
states and operators are time-dependent but in a way that is well adapted 
to perturbation theory, especially in quantum field theory. The operators 
have a time dependence generated by the free Hamiltonian $\hat H_0$, say, and so a
‘free-particle’ mode expansion like (5.155) survives intact (here $\hat H_0 = \hat H_{\rm KG}$).
The states have a time dependence generated by the interaction $\hat H'$. Thus as
$\hat H' \to 0$ we return to the free-particle HP.

The way this works formally is as follows. In terms of the time-independent
SP operator $\hat A$ (cf appendix I), we define the corresponding IP operator $\hat A_I(t)$
by

$$\hat A_I(t) = e^{i\hat H_0t}\hat A e^{-i\hat H_0t}. \qquad (6.17)$$

This is just like the definition of the HP operator $\hat A(t)$ in appendix I, except
that $\hat H_0$ appears instead of the full $\hat H$. It follows that the time dependence of
$\hat A_I(t)$ is given by (I.8) with $\hat H \to \hat H_0$:

$$\frac{d\hat A_I(t)}{dt} = -i[\hat A_I(t), \hat H_0]. \qquad (6.18)$$

Equation (6.18) can also, of course, be derived by carefully differentiating
(6.17). Thus, as mentioned already, the time dependence of $\hat A_I(t)$ is generated
by the free part of the Hamiltonian, by construction.

As applied to our model theory (6.12), then, our field $\hat\phi$ will now be specified
as being in the IP, $\hat\phi_I(\mathbf{x},t)$. What about the field canonically conjugate
to $\hat\phi_I(\mathbf{x},t)$, in the case when the interaction is included? In the HP, as long as
the interaction does not contain time derivatives, as is the case here, the field
canonically conjugate to the interacting field remains the same as the free-field
case:

$$\hat\pi(\mathbf{x},t) = \frac{\partial\hat{\mathcal L}}{\partial\dot{\hat\phi}(\mathbf{x},t)} = \frac{\partial\hat{\mathcal L}_{\rm KG}}{\partial\dot{\hat\phi}(\mathbf{x},t)} = \dot{\hat\phi}(\mathbf{x},t) \qquad (6.19)$$

so that we continue to adopt the equal-time commutation relation

$$[\hat\phi(\mathbf{x},t), \hat\pi(\mathbf{y},t)] = i\delta^3(\mathbf{x}-\mathbf{y}) \qquad (6.20)$$

for the Heisenberg fields. But the IP fields are related to the HP fields by a
unitary transformation $\hat U$, as we can see by combining (6.17) with (I.7):

$$\hat A_I(t) = e^{i\hat H_0t}e^{-i\hat Ht}\hat A(t)e^{i\hat Ht}e^{-i\hat H_0t} = \hat U\hat A(t)\hat U^{-1} \qquad (6.21)$$

where $\hat U = e^{i\hat H_0t}e^{-i\hat Ht}$, and it is easy to check that $\hat U\hat U^\dagger = \hat U^\dagger\hat U = \hat 1$. So taking
equation (6.20) and pre-multiplying by $\hat U$ and post-multiplying by $\hat U^{-1}$ on
both sides, we obtain

$$[\hat\phi_I(\mathbf{x},t), \hat\pi_I(\mathbf{y},t)] = i\delta^3(\mathbf{x}-\mathbf{y}) \qquad (6.22)$$

showing that, in the interacting case, the IP fields $\hat\phi_I$ and $\hat\pi_I$ obey the free
field commutation relation. Thus in the IP case the interacting fields obey the
same equations of motion and the same commutation relations as the free-field
operators. It follows that the mode expansion (5.155), and the commutation
relations (5.158) for the mode creation and annihilation operators, can be
taken straight over for the IP operators.

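We now turn to the states in the IP. To preserve consistency between the
matrix elements in the Schrödinger and interaction pictures (cf the step from
(I.6) to (I.7)) we define the corresponding IP state vector by

$$|\psi(t)\rangle_I = e^{i\hat H_0t}|\psi(t)\rangle \qquad (6.23)$$

in terms of the SP state $|\psi(t)\rangle$. We now use (6.23) to find the equation of
motion of $|\psi(t)\rangle_I$. We have

$$i\frac{d}{dt}|\psi(t)\rangle_I = -\hat H_0e^{i\hat H_0t}|\psi(t)\rangle + e^{i\hat H_0t}\,i\frac{d}{dt}|\psi(t)\rangle$$

$$= -\hat H_0|\psi(t)\rangle_I + e^{i\hat H_0t}(\hat H_0 + \hat H')|\psi(t)\rangle$$

$$= e^{i\hat H_0t}\hat H'|\psi(t)\rangle$$

$$= e^{i\hat H_0t}\hat H'e^{-i\hat H_0t}|\psi(t)\rangle_I \qquad (6.24)$$

or

$$i\frac{d}{dt}|\psi(t)\rangle_I = \hat H_I'(t)|\psi(t)\rangle_I \qquad (6.25)$$

where

$$\hat H_I' = e^{i\hat H_0t}\hat H'e^{-i\hat H_0t} \qquad (6.26)$$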


is the interaction Hamiltonian in the interaction picture. The words
‘in the interaction picture’ are important: they mean that all operators in $\hat H_I'$
have the (known) free-field time dependence, which would not be the case for
$\hat H'$ in the HP. Thus, as
mentioned earlier, the states in the IP have a time dependence generated by
the interaction Hamiltonian, and this derivation has shown us that it is, in
fact, the interaction Hamiltonian in the IP which is the appropriate generator
of time change in this picture.

Equation (6.25) is a slightly simplified form of the Tomonaga—Schwinger 
equation, which formed the starting point of the approach to QED followed by 
Schwinger (Schwinger 1948b, 1949a, b) and independently by Tomonaga and 
his group (Tomonaga 1946, Koba, Tati and Tomonaga 1947a, b, Kanesawa 
and Tomonaga 1948a, b, Koba and Tomonaga 1948, Koba and Takeda 1948, 
1949).


6.2.2 The S-matrix and the Dyson expansion


We now start the job of applying the IP formalism to scattering and decay
processes in quantum field theory, treated in perturbation theory; for this,
following Dyson (1949a, b), the crucial quantity is the scattering matrix, or
S-matrix for short, which we now introduce. A scattering process may plausibly
be described in the following terms. At a time $t \to -\infty$, long before any
interaction has occurred, we expect the effect of $\hat H_I'$ to be negligible so that,
from (6.25), $|\psi(-\infty)\rangle_I$ will be a constant state vector $|i\rangle$, which is in fact an
eigenstate of $\hat H_0$. Thus $|i\rangle$ will contain a certain number of non-interacting
particles with definite momenta, and $|\psi(-\infty)\rangle_I = |i\rangle$. As time evolves, the
particles approach each other and may scatter, leading in the distant future
(at $t \to \infty$) to another constant state $|\psi(\infty)\rangle_I$ containing non-interacting
particles. Note that $|\psi(\infty)\rangle_I$ will in general contain many different components,
each with (in principle) different numbers and types of particle; these different
components in $|\psi(\infty)\rangle_I$ will be denoted by $|f\rangle$. The $\hat S$-operator is now defined
via

$$|\psi(\infty)\rangle_I = \hat S|\psi(-\infty)\rangle_I = \hat S|i\rangle. \qquad (6.27)$$

A particular S-matrix element is then the amplitude for finding a particular
final state $|f\rangle$ in $|\psi(\infty)\rangle_I$:

$$\langle f|\psi(\infty)\rangle_I = \langle f|\hat S|i\rangle \equiv S_{fi}. \qquad (6.28)$$

Thus we may write

$$|\psi(\infty)\rangle_I = \sum_f |f\rangle\langle f|\psi(\infty)\rangle_I = \sum_f S_{fi}|f\rangle. \qquad (6.29)$$

It is clear that it is these S-matrix elements $S_{fi}$ that we need to calculate, and
the associated probabilities $|S_{fi}|^2$.

Before proceeding we note an important property of $\hat S$. Assuming that
$|\psi(\infty)\rangle_I$ and $|i\rangle$ are both normalized, we have

$$1 = {}_I\langle\psi(\infty)|\psi(\infty)\rangle_I = \langle i|\hat S^\dagger\hat S|i\rangle = \langle i|i\rangle \qquad (6.30)$$

implying that $\hat S$ is unitary: $\hat S^\dagger\hat S = \hat 1$. Taking matrix elements of this gives us
the result

$$\sum_k S_{kf}^*S_{ki} = \delta_{fi}. \qquad (6.31)$$

Putting $i = f$ in (6.31) yields $\sum_k |S_{ki}|^2 = 1$, which confirms that the expansion
coefficients in (6.29) must obey the usual condition that the sum of all the
partial probabilities must add up to 1. Note, however, that in the present case
the states involved may contain different numbers of particles.


We set up a perturbation-theory approach to calculating $\hat S$ as follows.
Integrating (6.25) subject to the condition at $t \to -\infty$ yields

$$|\psi(t)\rangle_I = |i\rangle + \int_{-\infty}^{t}(-i\hat H_I'(t'))|\psi(t')\rangle_I\,dt'. \qquad (6.32)$$

This is an integral equation in which the unknown $|\psi(t)\rangle_I$ is buried under
the integral on the right-hand side, rather similar to the one we encounter in
non-relativistic scattering theory (equation (H.12) of appendix H). As in that
case, we solve it iteratively. If $\hat H_I'$ is neglected altogether, then the solution is

$$|\psi(t)\rangle_I^{(0)} = |i\rangle. \qquad (6.33)$$

To get the first order in $\hat H_I'$ correction to this, insert (6.33) in place of $|\psi(t')\rangle_I$
on the right-hand side of (6.32) to obtain

$$|\psi(t)\rangle_I^{(1)} = |i\rangle + \int_{-\infty}^{t}(-i\hat H_I'(t_1))\,dt_1\,|i\rangle \qquad (6.34)$$

recalling that $|i\rangle$ is a constant state vector. Putting this back into (6.32) yields
$|\psi(t)\rangle_I$ correct to second order in $\hat H_I'$:

$$|\psi(t)\rangle_I^{(2)} = \left\{1 + \int_{-\infty}^{t}(-i\hat H_I'(t_1))\,dt_1 + \int_{-\infty}^{t}dt_1\int_{-\infty}^{t_1}dt_2\,(-i\hat H_I'(t_1))(-i\hat H_I'(t_2))\right\}|i\rangle \qquad (6.35)$$


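which is as far as we intend to go. Letting $t \to \infty$ then gives us our perturbative
series for the $\hat S$-operator:

$$\hat S = 1 + \int_{-\infty}^{\infty}(-i\hat H_I'(t_1))\,dt_1 + \int_{-\infty}^{\infty}dt_1\int_{-\infty}^{t_1}dt_2\,(-i\hat H_I'(t_1))(-i\hat H_I'(t_2)) + \cdots \qquad (6.36)$$

with the dots indicating the higher-order terms, which are in fact summarized
by the full formula

$$\hat S = \sum_{n=0}^{\infty}(-i)^n\int_{-\infty}^{\infty}dt_1\int_{-\infty}^{t_1}dt_2\cdots\int_{-\infty}^{t_{n-1}}dt_n\,\hat H_I'(t_1)\hat H_I'(t_2)\cdots\hat H_I'(t_n). \qquad (6.37)$$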
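
To see the first-order term of (6.36) at work, one can compare it with an exactly solvable toy system. The sketch below is not from the text and rests on stated assumptions: a two-level system with $\hat H_0 = \mathrm{diag}(0,\omega)$ and $\hat H' = g\sigma_x$, with the interaction acting over the finite interval $0 \le t \le T$ instead of $-\infty < t < \infty$. In that case $\langle 1|\hat H_I'(t)|0\rangle = g\,e^{i\omega t}$, and the exact IP amplitude $\langle 1|e^{i\hat H_0T}e^{-i\hat HT}|0\rangle$ follows from the closed-form exponential of a $2\times 2$ Hermitian matrix.

```python
import cmath
import math

def first_order_amplitude(g, omega, T):
    """-i * integral_0^T g exp(i omega t) dt: the first-order Dyson term."""
    return -g * (cmath.exp(1j * omega * T) - 1.0) / omega

def exact_amplitude(g, omega, T):
    """<1| exp(i H0 T) exp(-i H T) |0> for H0 = diag(0, omega), H = H0 + g sigma_x.

    H = (omega/2) 1 + g sigma_x - (omega/2) sigma_z, so exp(-i H T) has the
    off-diagonal element exp(-i omega T / 2) * (-i g sin(Omega T) / Omega),
    with Omega = sqrt(g^2 + omega^2 / 4).
    """
    Omega = math.sqrt(g * g + omega * omega / 4.0)
    return cmath.exp(1j * omega * T / 2.0) * (-1j * g * math.sin(Omega * T) / Omega)

def truncation_error(g, omega=1.0, T=2.0):
    """Size of the terms neglected by stopping at first order."""
    return abs(first_order_amplitude(g, omega, T) - exact_amplitude(g, omega, T))
```

The two amplitudes agree at order $g$, and the mismatch falls rapidly as $g$ is reduced, as expected when the series (6.36) is truncated after its first non-trivial term.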


We could immediately start getting to work with (6.37), but there is one
more useful technical adjustment to make. Remembering that

$$\hat H_I'(t) = \int\hat{\mathcal H}_I'(\mathbf{x},t)\,d^3x \qquad (6.38)$$

we can write the second-order term of (6.36) as

$$\iint_{t_1>t_2}d^4x_1\,d^4x_2\,(-i\hat{\mathcal H}_I'(x_1))(-i\hat{\mathcal H}_I'(x_2)) \qquad (6.39)$$

which looks much more symmetrical in $\mathbf{x}$ and $t$. However, there is still an
awkward asymmetry between the $\mathbf{x}$-integrals and the $t$-integrals because of the
$t_1 > t_2$ condition. The $t$-integrals can be converted to run from $-\infty$ to $\infty$
without constraint, like the $\mathbf{x}$ ones, by a clever trick. Note that the ordering of
the operators $\hat{\mathcal H}_I'$ is significant (since they will contain non-commuting bits),
and that it is actually given by the order of their time arguments, ‘earlier’
operators appearing to the right of ‘later’ ones. This feature must be preserved,
obviously, when we let the $t$-integrals run over the full infinite domain.
We can arrange for this by introducing the time-ordering symbol $T$, which is
defined by

$$T(\hat{\mathcal H}_I'(x_1)\hat{\mathcal H}_I'(x_2)) = \begin{cases}\hat{\mathcal H}_I'(x_1)\hat{\mathcal H}_I'(x_2) & \text{for } t_1 > t_2\\ \hat{\mathcal H}_I'(x_2)\hat{\mathcal H}_I'(x_1) & \text{for } t_1 < t_2\end{cases} \qquad (6.40)$$


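and similarly for more products, and for arbitrary operators. Then (see
problem 6.1) (6.39) can be written as

$$\frac{1}{2}\iint d^4x_1\,d^4x_2\,T[(-i\hat{\mathcal H}_I'(x_1))(-i\hat{\mathcal H}_I'(x_2))] \qquad (6.41)$$

where the integrals are now unrestricted. Applying a similar analysis to the
general term gives us the Dyson expansion of the $\hat S$ operator:

$$\hat S = \sum_{n=0}^{\infty}\frac{(-i)^n}{n!}\int\cdots\int d^4x_1\,d^4x_2\cdots d^4x_n\,T\{\hat{\mathcal H}_I'(x_1)\hat{\mathcal H}_I'(x_2)\cdots\hat{\mathcal H}_I'(x_n)\}. \qquad (6.42)$$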

This fundamental formula provides the bridge leading from the
Tomonaga-Schwinger equation (6.25) to the Feynman amplitudes (Feynman 1949a, b),
as we shall see in detail in section 6.3.2 for the ‘ABC’ case.
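
The step from the restricted integral (6.39) to the unrestricted, time-ordered form (6.41) can be illustrated on a discrete time grid. The following sketch is an illustration only: the $2\times 2$ matrix `H(t)` is an arbitrary non-commuting stand-in for the interaction (an assumption, not anything from the ABC model), the spatial integrals are dropped, and the time range is cut off at `t_max`. On a grid the two sums differ only by the equal-time terms, which are of order `dt`.

```python
import math

def matmul(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(A, B, scale=1.0):
    """Return A + scale * B for 2x2 matrices."""
    return [[A[i][j] + scale * B[i][j] for j in range(2)] for i in range(2)]

def H(t):
    """Toy 'interaction' cos(t) sigma_x + sin(2t) sigma_z; H(t1), H(t2) do not commute."""
    return [[math.sin(2 * t), math.cos(t)], [math.cos(t), -math.sin(2 * t)]]

def time_ordered(t1, t2):
    """T[H(t1) H(t2)]: the operator with the later time stands on the left."""
    return matmul(H(t1), H(t2)) if t1 >= t2 else matmul(H(t2), H(t1))

def compare(n=200, t_max=1.0):
    """Riemann sums for the restricted integral and for half the T-ordered one."""
    dt = t_max / n
    times = [(i + 0.5) * dt for i in range(n)]
    nested = [[0.0, 0.0], [0.0, 0.0]]     # sum over t1 > t2 of H(t1) H(t2)
    symmetric = [[0.0, 0.0], [0.0, 0.0]]  # (1/2) * sum over all t1, t2 of T[H(t1) H(t2)]
    for i, t1 in enumerate(times):
        for j, t2 in enumerate(times):
            if i > j:
                nested = add(nested, matmul(H(t1), H(t2)), dt * dt)
            symmetric = add(symmetric, time_ordered(t1, t2), 0.5 * dt * dt)
    return nested, symmetric
```

The factor $1/2$ compensates for counting each pair of times twice, once in each order; the same counting for $n$ operators produces the $1/n!$ of (6.42).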


6.3 Applications to the ‘ABC’ theory 


As previously explained, the simple self-interacting $\hat\phi^3$ theory is not respectable.
Following Griffiths (2008) we shall instead apply the foregoing covariant
perturbation theory to a hypothetical world consisting of three distinct types of
scalar particles A, B and C, with masses $m_A$, $m_B$, $m_C$. Each is described by
a real scalar field which, if free, would obey the appropriate KG equation; the
interaction term is $g\hat\phi_A\hat\phi_B\hat\phi_C$. We shall from now on omit the IP subscript ‘I’,
since all operators are taken to be in the IP. Thus the Hamiltonian is

$$\hat H = \hat H_0 + \hat H' \qquad (6.43)$$

where

$$\hat H_0 = \frac{1}{2}\sum_{i=A,B,C}\int[\hat\pi_i^2 + (\nabla\hat\phi_i)^2 + m_i^2\hat\phi_i^2]\,d^3x \qquad (6.44)$$

and

$$\hat H' = g\int d^3x\,\hat\phi_A\hat\phi_B\hat\phi_C \equiv \int\hat{\mathcal H}'\,d^3x. \qquad (6.45)$$


Each field $\hat\phi_i$ ($i$ = A, B, C) has a mode expansion of the form (5.143), and
associated creation and annihilation operators $\hat a_i^\dagger$ and $\hat a_i$ which obey the
commutation relations

$$[\hat a_i(k), \hat a_j^\dagger(k')] = (2\pi)^3\delta^3(\mathbf{k}-\mathbf{k}')\delta_{ij}, \qquad i,j = {\rm A, B, C}. \qquad (6.46)$$

The new feature in (6.46) is that operators associated with distinct particles
commute. In a similar way, we also have $[\hat a_i, \hat a_j] = [\hat a_i^\dagger, \hat a_j^\dagger] = 0$.


6.3.1 The decay C → A + B


As our first application of (6.42), we shall calculate the decay rate (or
resonance width) for the decay C → A + B, to lowest order in $g$. Admittedly this is
not yet a realistic, physical, example; even so, the basic steps in the calculation
are common to more complicated physical examples, such as $W^- \to e^- + \bar\nu_e$.

We suppose that the initial state $|i\rangle$ consists of one C particle with
4-momentum $p_C$, and that the final state in which we are interested is that with
one A and one B particle present, with 4-momenta $p_A$ and $p_B$ respectively.
We want to calculate the matrix element

$$S_{fi} = \langle p_A, p_B|\hat S|p_C\rangle \qquad (6.47)$$

to lowest order in $g$. (Note that the ‘1’ term in (6.36) cannot contribute here
because the initial and final states are plainly orthogonal.) This means that
we need to evaluate the amplitude

$$\mathcal A_{fi}^{(1)} = -ig\langle p_A, p_B|\int d^4x\,\hat\phi_A(x)\hat\phi_B(x)\hat\phi_C(x)|p_C\rangle. \qquad (6.48)$$


To proceed we need to decide on the normalization of our states $|p_i\rangle$. We will
define (for $i$ = A, B, C)

$$|p_i\rangle = \sqrt{2E_i}\,\hat a_i^\dagger(p_i)|0\rangle \qquad (6.49)$$

where $E_i = \sqrt{m_i^2 + \mathbf{p}_i^2}$, so that (using (6.46))

$$\langle p_i'|p_i\rangle = 2E_i(2\pi)^3\delta^3(\mathbf{p}_i' - \mathbf{p}_i). \qquad (6.50)$$

The quantity $E_i\delta^3(\mathbf{p}_i' - \mathbf{p}_i)$ is Lorentz invariant. Note that the completeness
relation for such states reads

$$\int\frac{d^3p_i}{(2\pi)^32E_i}|p_i\rangle\langle p_i| = 1 \qquad (6.51)$$

where the ‘1’ on the right-hand side means the identity in the subspace of
such one-particle states, and zero for all other states. The normalization
choice (6.49) corresponds (see comment (5) in section 5.2.5) to a wavefunction
normalization of $2E_i$ particles per unit volume.

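Consider now just the $\hat\phi_C(x)|p_C\rangle$ piece of (6.48). This is

$$\int\frac{d^3k}{(2\pi)^3\sqrt{2E_k}}[\hat a_C(k)e^{-ik\cdot x} + \hat a_C^\dagger(k)e^{ik\cdot x}]\sqrt{2E_C}\,\hat a_C^\dagger(p_C)|0\rangle \qquad (6.52)$$

where $k = (E_k, \mathbf{k})$ and $E_k = \sqrt{\mathbf{k}^2 + m_C^2}$. The term with two $\hat a_C^\dagger$’s will give
zero when bracketed with a final state containing no C particles. In the other
term, we use (6.46) together with $\hat a_C(k)|0\rangle = 0$ to reduce (6.52) to

$$\int\frac{d^3k}{(2\pi)^3\sqrt{2E_k}}(2\pi)^3\delta^3(\mathbf{p}_C - \mathbf{k})\sqrt{2E_C}\,e^{-ik\cdot x}|0\rangle = e^{-ip_C\cdot x}|0\rangle \qquad (6.53)$$

where $p_C = (\sqrt{\mathbf{p}_C^2 + m_C^2}, \mathbf{p}_C)$. In exactly the same way we find that, when
bracketed with an initial state containing no A’s or B’s,

$$\langle p_A, p_B|\hat\phi_A(x)\hat\phi_B(x) = \langle 0|e^{ip_A\cdot x}e^{ip_B\cdot x}. \qquad (6.54)$$

Hence the amplitude (6.48) becomes just

$$\mathcal A_{fi}^{(1)} = -ig\int d^4x\,e^{i(p_A+p_B-p_C)\cdot x} = -ig(2\pi)^4\delta^4(p_A + p_B - p_C). \qquad (6.55)$$

Unsurprisingly, but reassuringly, we have discovered that the amplitude
vanishes unless the 4-momentum is conserved via the $\delta$-function condition:
$p_C = p_A + p_B$.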

It is clear that such a transition will not occur unless $m_C > m_A + m_B$
(in the rest frame of the C, we need $m_C = \sqrt{m_A^2 + \mathbf{p}^2} + \sqrt{m_B^2 + \mathbf{p}^2}$), so let
us assume this to be the case. We would now like to calculate the rate for
the decay C → A + B. To do this, we shall adopt a plausible generalization
of the ordinary procedure followed in quantum mechanical time-dependent
perturbation theory (the reader may wish to consult section H.3 of appendix H
at this point, to see a non-relativistic analogue). The first problem is that
the transition probability $|\mathcal A_{fi}^{(1)}|^2$ apparently involves the square of the
four-dimensional $\delta$-function. This is bad news, since (to take a simple case, and
using (E.53)) $\delta(x-a)\delta(x-a) = \delta(x-a)\delta(0)$ and $\delta(0)$ is infinite. In our
case we have a four-fold infinity. This trouble has arisen because we have
been using plane-wave solutions of our wave equation, and these notoriously
lead to such problems. A proper procedure would set the whole thing up
using wave packets, as is done, for instance, in Peskin and Schroeder (1995),
section 4.5. An easier remedy is to adopt ‘box normalization’, in which we
imagine that space has the finite volume $V$, and the interaction is turned on
only for a time $T$. Then ‘$(2\pi)^4\delta^4(0)$’ is effectively ‘$VT$’ (see Weinberg (1995,
section 3.4)). Dividing this factor out, the transition rate per unit volume is
then

$$\dot P_{fi} = |\mathcal A_{fi}^{(1)}|^2/VT = (2\pi)^4\delta^4(p_A + p_B - p_C)|\mathcal M_{fi}|^2 \qquad (6.56)$$

where (cf (6.55))

$$\mathcal A_{fi}^{(1)} = (2\pi)^4\delta^4(p_A + p_B - p_C)\,i\mathcal M_{fi} \qquad (6.57)$$

so that the invariant amplitude $i\mathcal M_{fi}$ is just $-ig$, in this case.

Equation (6.56) is the probability per unit time for a transition to one
specific final state $|f\rangle$. But in the present case (and in all similar ones with at
least two particles in the final state), the A + B final states form a continuum,
and to get the total rate $\Gamma$ we need to integrate $\dot P_{fi}$ over all the continuum
of final states, consistent with energy-momentum conservation. The
corresponding differential decay rate $d\Gamma$ is defined by $d\Gamma = \dot P_{fi}\,dN_f$ where $dN_f$ is
the number of final states, per particle, lying in a momentum space volume
$d^3p_A\,d^3p_B$ about $\mathbf{p}_A$ and $\mathbf{p}_B$. For the normalization (6.49), this number is

$$dN_f = \frac{d^3p_A}{(2\pi)^32E_A}\frac{d^3p_B}{(2\pi)^32E_B}. \qquad (6.58)$$

Finally, to get a normalization-independent quantity we must divide by the
number of decaying particles per unit volume, which is $2E_C$. Thus our final
formula for the decay rate is

$$\Gamma = \int d\Gamma = \frac{1}{2E_C}(2\pi)^4\int\delta^4(p_A + p_B - p_C)|\mathcal M_{fi}|^2\frac{d^3p_A}{(2\pi)^32E_A}\frac{d^3p_B}{(2\pi)^32E_B}. \qquad (6.59)$$

Note that the ‘$d^3p/2E$’ factors are Lorentz invariant (see the exercise in
appendix E) and so are all the other terms in (6.59) except $E_C$, which contributes
the correct Lorentz-transformation character for a rate (i.e. rate $\propto 1/\gamma$).

We now calculate the total rate $\Gamma$ in the rest frame of the decaying C
particle. In this case, the 3-momentum part of the $\delta^4$ gives $\mathbf{p}_A + \mathbf{p}_B = 0$, so
$\mathbf{p}_A = \mathbf{p} = -\mathbf{p}_B$, and the energy part becomes $\delta(E - m_C)$ where

$$E = \sqrt{m_A^2 + \mathbf{p}^2} + \sqrt{m_B^2 + \mathbf{p}^2} = E_A + E_B. \qquad (6.60)$$

So the total rate is

$$\Gamma = \frac{g^2}{2m_C}\frac{1}{(2\pi)^2}\int\frac{d^3p}{4E_AE_B}\delta(E - m_C). \qquad (6.61)$$

Differentiating (6.60) we find

$$dE = \left(\frac{|\mathbf{p}|}{E_A} + \frac{|\mathbf{p}|}{E_B}\right)d|\mathbf{p}| = \frac{|\mathbf{p}|E}{E_AE_B}\,d|\mathbf{p}|. \qquad (6.62)$$

Thus we may write

$$d^3p = 4\pi|\mathbf{p}|^2\,d|\mathbf{p}| = 4\pi|\mathbf{p}|\frac{E_AE_B}{E}\,dE \qquad (6.63)$$

and use the energy $\delta$-function in (6.61) to do the $dE$ integral yielding finally

$$\Gamma = \frac{g^2}{8\pi}\frac{|\mathbf{p}|}{m_C^2}. \qquad (6.64)$$

The quantity $|\mathbf{p}|$ is actually determined from (6.60) now with $E = m_C$; after
some algebra, we find (problem 5.2)

$$|\mathbf{p}| = [m_A^4 + m_B^4 + m_C^4 - 2m_A^2m_B^2 - 2m_B^2m_C^2 - 2m_C^2m_A^2]^{1/2}/2m_C. \qquad (6.65)$$
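
Formulae (6.64) and (6.65) are easy to evaluate and to cross-check against energy conservation, $\sqrt{m_A^2 + \mathbf{p}^2} + \sqrt{m_B^2 + \mathbf{p}^2} = m_C$. The sketch below does this in natural units; the function names and the sample masses used with it are illustrative assumptions, not values from the text.

```python
import math

def decay_momentum(m_a, m_b, m_c):
    """|p| of either decay product in the C rest frame, from (6.65)."""
    if m_c < m_a + m_b:
        raise ValueError("decay C -> A + B is kinematically forbidden")
    bracket = (m_a**4 + m_b**4 + m_c**4
               - 2 * m_a**2 * m_b**2
               - 2 * m_b**2 * m_c**2
               - 2 * m_c**2 * m_a**2)
    # max() guards against tiny negative rounding errors exactly at threshold
    return math.sqrt(max(bracket, 0.0)) / (2 * m_c)

def decay_width(g, m_a, m_b, m_c):
    """Lowest-order width (6.64): Gamma = g^2 |p| / (8 pi m_c^2)."""
    return g**2 * decay_momentum(m_a, m_b, m_c) / (8 * math.pi * m_c**2)
```

For example, with $m_A = 1$, $m_B = 2$, $m_C = 5$ the momentum from `decay_momentum` gives $E_A + E_B = 5$ as required, and at threshold ($m_C = m_A + m_B$) both the momentum and the width vanish.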


Equation (6.64) is the result of an ‘almost real life’ calculation and a
number of comments are in order. First, consider the question of dimensions. In
our units $\hbar = c = 1$, $\Gamma$ as an inverse time should have the dimensions of a
mass (see appendix B), which can also be understood if we think of $\Gamma$ as the
width of an unstable resonance state. This requires ‘$g$’ to have the dimensions
of a mass, i.e. $g \sim M$ in these units. Going back to our Hamiltonian (6.44)
and (6.45), which must also have dimensions of a mass, we see from (6.44)
that the scalar fields $\hat\phi_i \sim M$ (using $d^3x \sim M^{-3}$), and hence from (6.45)
$g \sim M$ as required. It turns out that the dimensionality of the coupling
constants (such as $g$) is of great significance in quantum field theory. In QED,
the analogous quantity is the charge $e$, and this is dimensionless in our units
($\alpha = e^2/4\pi \approx 1/137$, see appendix C). However, we saw in (1.31) that Fermi’s
‘four-fermion’ coupling constant $G$ had dimensions $\sim M^{-2}$, while Yukawa’s
‘$g_N$’ and ‘$g'$’ (see figure 1.4) were both dimensionless. In fact, as we shall
explain in section 11.8, the dimensionality of a theory’s coupling constant is
an important guide as to whether the infinities generally present in the theory
can be controlled by renormalization (see chapter 10) or not: in particular,
theories in which the coupling constant has negative mass dimensions, such as
the ‘four-fermion’ theory, are not renormalizable. Theories with dimensionless
coupling constants, such as QED, are generally renormalizable, though
not invariably so. Theories whose coupling constants have positive mass
dimension, as in the ABC model, are ‘super-renormalizable’, meaning (roughly)
that they have fewer basic divergences than ordinary renormalizable theories
(see section 11.8).
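
The dimension counting in this paragraph can be mechanized. The sketch below assumes the standard four-dimensional assignments with $\hbar = c = 1$: the Lagrangian density has mass dimension 4, scalar and vector fields have dimension 1, and fermion fields have dimension 3/2 (the fermion and vector values are standard power counting, not derived in this section). The dictionary labels and function names are hypothetical, and the classification is only the rough guide described above.

```python
from fractions import Fraction

# Mass dimensions of fields in four space-time dimensions (assumed, see lead-in).
FIELD_DIMENSION = {
    "scalar": Fraction(1),
    "fermion": Fraction(3, 2),
    "vector": Fraction(1),
}

def coupling_dimension(fields, derivatives=0):
    """Mass dimension of the coupling multiplying a product of fields.

    The interaction term in the Lagrangian density must have dimension 4,
    so the coupling carries whatever the fields and derivatives leave over.
    """
    return 4 - sum(FIELD_DIMENSION[f] for f in fields) - derivatives

def classify(dimension):
    """Rough renormalizability guide based on the coupling's mass dimension."""
    if dimension > 0:
        return "super-renormalizable"
    if dimension == 0:
        return "renormalizable (generally)"
    return "non-renormalizable"
```

With these assignments the ABC coupling (`["scalar"] * 3`) has dimension 1, a QED-like vertex (`["fermion", "fermion", "vector"]`) has dimension 0, and a four-fermion coupling has dimension -2, matching $g \sim M$, the dimensionless $e$, and $G \sim M^{-2}$ in the text.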

In the present case, let us say that the mass of the decaying particle $m_C$
‘sets the scale’ for $g$, so that we write $g = \tilde g m_C$ and then

$$\Gamma = \frac{\tilde g^2}{8\pi}|\mathbf{p}| \qquad (6.66)$$

where $\tilde g$ is dimensionless. Equation (6.66) shows us nicely that $\Gamma$ is simply
proportional to the energy release in the decay, as determined by $|\mathbf{p}|$ (one often
says that $\Gamma$ is determined ‘by the available phase space’). If $m_C$ is exactly
equal to $m_A + m_B$, then $|\mathbf{p}|$ vanishes and so does $\Gamma$. At the opposite extreme,
if $m_A$ and $m_B$ are negligible compared to $m_C$, we would have

$$\Gamma = \frac{\tilde g^2}{16\pi}m_C. \qquad (6.67)$$

Equation (6.67) shows that, even if $\tilde g^2/16\pi$ is small ($\sim 1/137$ say) $\Gamma$ can still
be surprisingly large if $m_C$ is, as in $W^- \to e^- + \bar\nu_e$ for example.


6.3.2 A + B → A + B scattering: the amplitudes


We now consider the two-particle → two-particle process

$$\mathrm{A + B \to A + B} \qquad (6.68)$$

in which the initial 4-momenta are $p_A$, $p_B$ and the final 4-momenta are $p_A'$,
$p_B'$ so that $p_A + p_B = p_A' + p_B'$. Our main task is to calculate the matrix
element $\langle p_A', p_B'|\hat S|p_A, p_B\rangle$ to lowest non-trivial order in $g$. The result will
be the derivation of our first ‘Feynman rules’ for amplitudes in perturbative
quantum field theory.

The first term in the S-operator expansion (6.42) is ‘1’, which does not 
involve g at all. Nevertheless, it is a useful exercise to evaluate and understand 
this contribution (which in the present case does not vanish), namely 


(0lâa (Ph an (Pp) 44 (pa ăi, (pB)l0) (1664 EB Eh Ep)". (6.69) 


We shall have to evaluate many such vacuum expectation values (vev) of products
of $\hat a^\dagger$’s and $\hat a$’s. The general strategy is to commute the $\hat a^\dagger$’s to the left,
and the $\hat a$’s to the right, and then make use of the facts

$$\langle 0|\hat a_i^\dagger = \hat a_i|0\rangle = 0 \tag{6.70}$$

for any $i$ = A, B, C. Thus, remembering that all ‘A’ operators commute with
all ‘B’ ones, the vev in (6.69) is equal to


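$$\langle 0|\hat a_A(p'_A)\hat a_A^\dagger(p_A)\{(2\pi)^3\delta^3(\mathbf{p}_B - \mathbf{p}'_B) + \hat a_B^\dagger(p_B)\hat a_B(p'_B)\}|0\rangle$$
$$= \langle 0|\{(2\pi)^3\delta^3(\mathbf{p}_A - \mathbf{p}'_A) + \hat a_A^\dagger(p_A)\hat a_A(p'_A)\}(2\pi)^3\delta^3(\mathbf{p}_B - \mathbf{p}'_B)|0\rangle$$
$$= (2\pi)^3\delta^3(\mathbf{p}_A - \mathbf{p}'_A)\,(2\pi)^3\delta^3(\mathbf{p}_B - \mathbf{p}'_B). \tag{6.71}$$

The $\delta$-functions enforce $E_A = E'_A$ and $E_B = E'_B$ so that (6.69) becomes

$$2E_A(2\pi)^3\delta^3(\mathbf{p}_A - \mathbf{p}'_A)\,2E_B(2\pi)^3\delta^3(\mathbf{p}_B - \mathbf{p}'_B), \tag{6.72}$$

a result which just expresses the normalization of the states, and the fact
that, with no ‘$g$’ entering, the particles have not interacted at all, but have
continued on their separate ways, quite unperturbed ($\mathbf{p}_A = \mathbf{p}'_A$, $\mathbf{p}_B = \mathbf{p}'_B$).
This contribution can be represented diagrammatically as figure 6.1.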

Next, consider the term of order $g$, which we used in C → A + B. This is

$$-ig\int \mathrm{d}^4x\,\langle p'_A, p'_B|\hat\phi_A(x)\hat\phi_B(x)\hat\phi_C(x)|p_A, p_B\rangle. \tag{6.73}$$

We have to remember, now, that all the $\hat\phi_i$ operators are in the interaction
picture and are therefore represented by standard mode expansions involving
the free creation and annihilation operators $\hat a_i^\dagger$ and $\hat a_i$, i.e. the same ones used
in defining the initial and final state vectors. It is then obvious that (6.73)
must vanish, since no C-particle exists in either the initial or final state, and
$\langle 0|\hat\phi_C|0\rangle = 0$.

FIGURE 6.1
The order $g^0$ term in the perturbative expansion: the two particles do not
interact.

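So we move on to the term of order $g^2$, which will provide the real meat
of this chapter. This term is

$$\frac{(-ig)^2}{2}\iint \mathrm{d}^4x_1\,\mathrm{d}^4x_2\,\langle 0|\hat a_A(p'_A)\hat a_B(p'_B)$$
$$\times\, T\{\hat\phi_A(x_1)\hat\phi_B(x_1)\hat\phi_C(x_1)\hat\phi_A(x_2)\hat\phi_B(x_2)\hat\phi_C(x_2)\}$$
$$\times\, \hat a_A^\dagger(p_A)\hat a_B^\dagger(p_B)|0\rangle\,(16E_AE_BE'_AE'_B)^{1/2}. \tag{6.74}$$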


The vev here involves the product of ten operators, so it will pay us to pause 
and think how such things may be efficiently evaluated. 
Consider the case of just four operators 


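$$\langle 0|\hat A\hat B\hat C\hat D|0\rangle \tag{6.75}$$

where each of $\hat A$, $\hat B$, $\hat C$, $\hat D$ is an $\hat a_i$, an $\hat a_i^\dagger$ or a linear combination of these. Let
$\hat A$ have the generic form $\hat A = \hat a + \hat a^\dagger$. Then (using $\langle 0|\hat a^\dagger = \hat a|0\rangle = 0$)

$$\langle 0|\hat A\hat B\hat C\hat D|0\rangle = \langle 0|\hat a\hat B\hat C\hat D|0\rangle = \langle 0|[\hat a, \hat B\hat C\hat D]|0\rangle. \tag{6.76}$$

Now it is an algebraic identity that

$$[\hat a, \hat B\hat C\hat D] = [\hat a, \hat B]\hat C\hat D + \hat B[\hat a, \hat C]\hat D + \hat B\hat C[\hat a, \hat D]. \tag{6.77}$$

Hence

$$\langle 0|\hat A\hat B\hat C\hat D|0\rangle = [\hat a, \hat B]\langle 0|\hat C\hat D|0\rangle + [\hat a, \hat C]\langle 0|\hat B\hat D|0\rangle + [\hat a, \hat D]\langle 0|\hat B\hat C|0\rangle, \tag{6.78}$$

remembering that all the commutators, if non-vanishing, are just ordinary
numbers (see (6.46)). We can rewrite (6.78) in more suggestive form by noting
that

$$[\hat a, \hat B] = \langle 0|[\hat a, \hat B]|0\rangle = \langle 0|\hat a\hat B|0\rangle = \langle 0|\hat A\hat B|0\rangle. \tag{6.79}$$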


Thus the vev of a product of four operators is just the sum of the products 
of all the possible pairwise ‘contractions’ (the name given to the vev of the 
product of two fields): 


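$$\langle 0|\hat A\hat B\hat C\hat D|0\rangle = \langle 0|\hat A\hat B|0\rangle\langle 0|\hat C\hat D|0\rangle + \langle 0|\hat A\hat C|0\rangle\langle 0|\hat B\hat D|0\rangle + \langle 0|\hat A\hat D|0\rangle\langle 0|\hat B\hat C|0\rangle. \tag{6.80}$$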
This result generalizes to the vev of the product of any number of operators; 
there is also a similar result for the vev of time-ordered products of operators, 
which is known as Wick’s theorem (Wick 1950), and is indispensable for a 
general discussion of quantum field perturbation theory. 
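The pairwise-contraction rule (6.80) and its generalization can be tested on a toy system. The sketch below is an illustration under stated assumptions, not the field-theory calculation itself: it uses a single bosonic mode with $[\hat a, \hat a^\dagger] = 1$, represents each operator as a pair `(alpha, beta)` meaning $\alpha\hat a + \beta\hat a^\dagger$, and compares the directly evaluated vev with the sum over all pairings. All function names are hypothetical.

```python
import math

def pairings(items):
    """Yield every way of splitting `items` into pairs (order inside a pair kept)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail

def apply(op, state):
    """Act with op = (alpha, beta), i.e. alpha*a + beta*a_dagger, on a Fock state.

    `state` maps occupation number n -> amplitude.
    """
    alpha, beta = op
    out = {}
    for n, amp in state.items():
        if n > 0:  # a|n> = sqrt(n)|n-1>
            out[n - 1] = out.get(n - 1, 0.0) + alpha * math.sqrt(n) * amp
        # a_dagger|n> = sqrt(n+1)|n+1>
        out[n + 1] = out.get(n + 1, 0.0) + beta * math.sqrt(n + 1) * amp
    return out

def vev(ops):
    """<0| ops[0] ops[1] ... ops[-1] |0>, evaluated by direct operator action."""
    state = {0: 1.0}
    for op in reversed(ops):  # the rightmost operator acts first
        state = apply(op, state)
    return state.get(0, 0.0)

def wick_sum(ops):
    """Sum over all pairings of products of two-operator vevs, as in (6.80)."""
    total = 0.0
    for pairing in pairings(list(range(len(ops)))):
        term = 1.0
        for i, j in pairing:
            term *= vev([ops[i], ops[j]])
        total += term
    return total

number_of_pairings_for_ten = len(list(pairings(list(range(10)))))
```

For ten operators there are 945 pairings in general; in the vev of (6.74) most of them vanish, which is why only a handful of terms survive.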

Consider then the application of (6.80), as generalized to ten operators,
to the vev in (6.74). The only non-vanishing contractions are of the
form $\langle 0|\hat a_i\hat a_i^\dagger|0\rangle$. Thus the contractions of A-, B- and C-type operators can be
considered separately. As far as the C-operators are concerned, then, we can
immediately conclude that the only surviving contraction is

$$\langle 0|T(\hat\phi_C(x_1)\hat\phi_C(x_2))|0\rangle. \tag{6.81}$$


This quantity is, in fact, of fundamental importance: it is called the Feynman
propagator (in coordinate space) for the spin-0 C-particle. We shall derive
the mathematical formula for it in due course, but for the moment let us
understand its physical significance. Each of the $\hat\phi_C$’s in (6.81) can create
or destroy C-quanta, but for the vev to be non-zero anything created in the
‘initial’ state must be destroyed in the ‘final’ one. Which of the times $t_1$ and
$t_2$ is initial or final is determined by the $T$-ordering symbol: for $t_1 > t_2$, a
C-quantum is created at $x_2$ and destroyed at $x_1$, while for $t_1 < t_2$ a C-quantum
is created at $x_1$ and destroyed at $x_2$. Thus the amplitude (6.81) may be
represented pictorially as in figure 6.2, where time increases to the right, and
the vertical axis is a one-dimensional version of three-dimensional space. It
seems reasonable, indeed, to call this object the ‘propagator’, since it clearly
has to do with a quantum propagating between two space-time points.

We might now worry that this explicit time-ordering seems to introduce a
Lorentz non-invariant element into the calculation, ultimately threatening the
Lorentz invariance of the $\hat S$-operator (6.42). The reason that this is in fact not
the case exposes an important property of quantum field theory. If the two
points $x_1$ and $x_2$ are separated by a time-like interval (i.e. $(x_1 - x_2)^2 > 0$),
then the time-ordering is Lorentz invariant; this is because no proper Lorentz
transformation can alter the time-ordering of time-like separated events (here,
the events are the creation/annihilation of particles/antiparticles at $x_1$ and
$x_2$). By ‘proper’ is meant a transformation that does not reverse the sense of
time; the behaviour of the theory under time-reversal is a different question
altogether, discussed earlier in section 4.2.4. The fact that time-ordering is
invariant for time-like separated events is what guarantees that we cannot
influence our past, only our future. But what if the events are space-like
separated, $(x_1 - x_2)^2 < 0$? We know that the scalar fields $\hat\phi_i(x_1)$ and $\hat\phi_i(x_2)$
commute for equal times: remarkably, one can show (problem 5.6(b)) that
they also commute for $(x_1 - x_2)^2 < 0$; so in this sector of $x_1 - x_2$ space
the time-ordering symbol is irrelevant. Thus, contrary to appearances, the
$T$-product vev is Lorentz invariant. For the same reason, the $\hat S$ operator of
(6.42) is also Lorentz invariant: see, for example, Weinberg (1995, section 3.5).

FIGURE 6.2
C-quantum propagating (a) for $t_1 > t_2$ (from $x_2$ to $x_1$) and (b) for $t_1 < t_2$ (from
$x_1$ to $x_2$).

The property

$$[\hat\phi_i(x_1), \hat\phi_i(x_2)] = 0 \quad \text{for } (x_1 - x_2)^2 < 0 \tag{6.82}$$


has an important physical interpretation. In quantum mechanics, if operators 
representing physical observables commute with each other, then measure- 
ments of either observable can be performed without interfering with each 
other; the observables are said to be ‘compatible’. This is just what we would 
want for measurements done at two points which are space-like separated — 
no signal with speed less than or equal to light can connect them, and so we 
would expect them to be non-interfering. Condition (6.82) is often called a 
‘causality’ condition. 

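More mathematically, the amplitude (6.81) is in fact a Green function for
the KG operator $(\Box + m^2)$ (see appendix G, and problem 6.3). That is to
say,

$$(\Box_{x_1} + m_C^2)\langle 0|T(\hat\phi_C(x_1)\hat\phi_C(x_2))|0\rangle = -i\delta^4(x_1 - x_2). \tag{6.83}$$

Actually, problem 6.3 shows that (6.83) is true even when the $\langle 0|$ and $|0\rangle$
are removed, i.e. the operator quantity $T(\hat\phi_C(x_1)\hat\phi_C(x_2))$ is itself a KG Green
function. The work of appendices G and H indicates the central importance
of such Green functions in scattering theory, so we need not be surprised to
find such a thing appearing here.

Now let us figure out what are all the surviving terms in the vev in (6.74).
As far as contractions involving $\hat a_A(p'_A)$ are concerned, we have only three
non-zero possibilities:

$$\langle 0|\hat a_A(p'_A)\hat a_A^\dagger(p_A)|0\rangle, \quad \langle 0|\hat a_A(p'_A)\hat\phi_A(x_1)|0\rangle, \quad \langle 0|\hat a_A(p'_A)\hat\phi_A(x_2)|0\rangle. \tag{6.84}$$

There are similar possibilities for $\hat a_A^\dagger(p_A)$, $\hat a_B(p'_B)$ and $\hat a_B^\dagger(p_B)$. The upshot is
that we have only the following pairings to consider:

$$\langle 0|\hat a_A(p'_A)\hat a_A^\dagger(p_A)|0\rangle\langle 0|\hat a_B(p'_B)\hat a_B^\dagger(p_B)|0\rangle$$
$$\times\,\langle 0|T(\hat\phi_A(x_1)\hat\phi_A(x_2))|0\rangle\langle 0|T(\hat\phi_B(x_1)\hat\phi_B(x_2))|0\rangle\langle 0|T(\hat\phi_C(x_1)\hat\phi_C(x_2))|0\rangle; \tag{6.85}$$

$$\langle 0|\hat a_A(p'_A)\hat a_A^\dagger(p_A)|0\rangle\langle 0|\hat a_B(p'_B)\hat\phi_B(x_1)|0\rangle$$
$$\times\,\langle 0|\hat\phi_B(x_2)\hat a_B^\dagger(p_B)|0\rangle\langle 0|T(\hat\phi_C(x_1)\hat\phi_C(x_2))|0\rangle\langle 0|T(\hat\phi_A(x_1)\hat\phi_A(x_2))|0\rangle + x_1 \leftrightarrow x_2; \tag{6.86}$$

$$\langle 0|\hat a_B(p'_B)\hat a_B^\dagger(p_B)|0\rangle\langle 0|\hat a_A(p'_A)\hat\phi_A(x_1)|0\rangle$$
$$\times\,\langle 0|\hat\phi_A(x_2)\hat a_A^\dagger(p_A)|0\rangle\langle 0|T(\hat\phi_C(x_1)\hat\phi_C(x_2))|0\rangle\langle 0|T(\hat\phi_B(x_1)\hat\phi_B(x_2))|0\rangle + x_1 \leftrightarrow x_2; \tag{6.87}$$

$$\langle 0|\hat a_A(p'_A)\hat\phi_A(x_1)|0\rangle\langle 0|\hat\phi_A(x_2)\hat a_A^\dagger(p_A)|0\rangle\langle 0|\hat a_B(p'_B)\hat\phi_B(x_1)|0\rangle$$
$$\times\,\langle 0|\hat\phi_B(x_2)\hat a_B^\dagger(p_B)|0\rangle\langle 0|T(\hat\phi_C(x_1)\hat\phi_C(x_2))|0\rangle + x_1 \leftrightarrow x_2; \tag{6.88}$$

$$\langle 0|\hat a_A(p'_A)\hat\phi_A(x_1)|0\rangle\langle 0|\hat\phi_A(x_2)\hat a_A^\dagger(p_A)|0\rangle\langle 0|\hat a_B(p'_B)\hat\phi_B(x_2)|0\rangle$$
$$\times\,\langle 0|\hat\phi_B(x_1)\hat a_B^\dagger(p_B)|0\rangle\langle 0|T(\hat\phi_C(x_1)\hat\phi_C(x_2))|0\rangle + x_1 \leftrightarrow x_2. \tag{6.89}$$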


We already know that quantities like $\langle 0|\hat a_A(p'_A)\hat a_A^\dagger(p_A)|0\rangle$ yield something
proportional to $\delta^3(\mathbf{p}_A - \mathbf{p}'_A)$ and correspond to the initial A-particle going
‘straight through’. The other factors in (6.85)-(6.89) which are new are quantities
like $\langle 0|\hat a_A(p'_A)\hat\phi_A(x_1)|0\rangle$, which has the value (problem 6.4)

$$\langle 0|\hat a_A(p'_A)\hat\phi_A(x_1)|0\rangle = \frac{1}{\sqrt{2E'_A}}\,e^{ip'_A\cdot x_1} \tag{6.90}$$

which is proportional (depending on the adopted normalization) to the
wavefunction for an outgoing A-particle with 4-momentum $p'_A$.

We are now in a position to give a diagrammatic interpretation of all
of (6.85)-(6.89). In these diagrams, we shall not (as we did in figure 6.2)
draw two separately time-ordered pieces for each propagator. We shall not
indicate the time-ordering at all and we shall understand that both time-orderings
are always included in each propagator line. Term (6.85) then has
the structure shown in figure 6.3(a); term (6.86) that shown in figure 6.3(b);
term (6.87) that in figure 6.3(c); term (6.88) that in figure 6.3(d); and term
(6.89) that in figure 6.3(e). We recognize in figure 6.3(e) the long-awaited
Yukawa exchange process, which we shall shortly analyse in full, but the
formalism has yielded much else besides! We shall come back to figures 6.3(a),
(b) and (c) in section 6.3.5; for the moment we note that these processes do
not represent true interactions between the particles, since at least one goes
through unscattered in each case. So we shall concentrate on figures 6.3(d)
and (e), and derive the Feynman rules for them.

FIGURE 6.3
Graphical representation of (6.85)-(6.89): (a) (6.85); (b) (6.86); (c) (6.87);
(d) (6.88); (e) (6.89).

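First, consider figure 6.3(e), corresponding to the contraction (6.89). When
this is inserted into (6.74), the two terms in which $x_1$ and $x_2$ are interchanged
give identical results (interchanging $x_1$ and $x_2$ in the integral), so the
contribution we are discussing is

$$(-ig)^2\iint \mathrm{d}^4x_1\,\mathrm{d}^4x_2\, e^{i(p'_A - p_B)\cdot x_1}\,e^{i(p'_B - p_A)\cdot x_2}\,\langle 0|T(\hat\phi_C(x_1)\hat\phi_C(x_2))|0\rangle. \tag{6.91}$$

We must now turn our attention, as promised, to the propagator of (6.81),
$\langle 0|T(\hat\phi_C(x_1)\hat\phi_C(x_2))|0\rangle$. Inserting the mode expansion (6.52) for each of $\hat\phi_C(x_1)$
and $\hat\phi_C(x_2)$, and using the commutation relations (6.46) and the vacuum
conditions (6.70) we find (problem 6.5)

$$\langle 0|T(\hat\phi_C(x_1)\hat\phi_C(x_2))|0\rangle = \int\frac{\mathrm{d}^3\mathbf{k}}{(2\pi)^3 2\omega_k}\left[\theta(t_1 - t_2)\,e^{-i\omega_k(t_1 - t_2) + i\mathbf{k}\cdot(\mathbf{x}_1 - \mathbf{x}_2)} + \theta(t_2 - t_1)\,e^{-i\omega_k(t_2 - t_1) + i\mathbf{k}\cdot(\mathbf{x}_2 - \mathbf{x}_1)}\right] \tag{6.92}$$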

where $\omega_k = (\mathbf{k}^2 + m_C^2)^{1/2}$. This expression is very ‘uncovariant looking’,
due to the presence of the $\theta$-functions with time arguments. But the earlier
discussion, after (6.81), has assured us that the left-hand side of (6.92)
must be Lorentz invariant, and, by a clever trick, it is possible to recast
the right-hand side in manifestly invariant form. We introduce an integral
representation of the $\theta$-function via

$$\theta(t) = i\int_{-\infty}^{\infty}\frac{\mathrm{d}z}{2\pi}\,\frac{e^{-izt}}{z + i\epsilon} \tag{6.93}$$

where $\epsilon$ is an infinitesimally small positive quantity (see appendix F).
Multiplying (6.93) by $e^{-i\omega_k t}$ and changing $z$ to $z + \omega_k$ in the integral we have

$$\theta(t)\,e^{-i\omega_k t} = i\int_{-\infty}^{\infty}\frac{\mathrm{d}z}{2\pi}\,\frac{e^{-izt}}{z - (\omega_k - i\epsilon)}. \tag{6.94}$$


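Putting (6.94) into (6.92) then yields

$$\langle 0|T(\hat\phi_C(x_1)\hat\phi_C(x_2))|0\rangle = i\int\frac{\mathrm{d}^3\mathbf{k}\,\mathrm{d}z}{(2\pi)^4 2\omega_k}\left\{\frac{e^{-iz(t_1 - t_2) + i\mathbf{k}\cdot(\mathbf{x}_1 - \mathbf{x}_2)}}{z - (\omega_k - i\epsilon)} + \frac{e^{iz(t_1 - t_2) - i\mathbf{k}\cdot(\mathbf{x}_1 - \mathbf{x}_2)}}{z - (\omega_k - i\epsilon)}\right\}. \tag{6.95}$$

The exponentials and the volume element demand a more symmetrical notation:
let us write $k_0 = z$ so that ($k_0 = z$, $\mathbf{k}$) form the components of a 4-vector
$k^\mu$.¹ Note very carefully, however, that $k_0$ is not $(\mathbf{k}^2 + m_C^2)^{1/2}$! The variable $k_0$
is unrestricted, whereas it is $\omega_k$ that equals $(\mathbf{k}^2 + m_C^2)^{1/2}$. With this change
of notation, (6.95) becomes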


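$$\langle 0|T(\hat\phi_C(x_1)\hat\phi_C(x_2))|0\rangle = \int\frac{\mathrm{d}^4k}{(2\pi)^4}\,\frac{i}{2\omega_k}\left\{\frac{e^{-ik\cdot(x_1 - x_2)}}{k_0 - (\omega_k - i\epsilon)} + \frac{e^{ik\cdot(x_1 - x_2)}}{k_0 - (\omega_k - i\epsilon)}\right\}. \tag{6.96}$$

Changing $k \to -k$ ($k_0 \to -k_0$, $\mathbf{k} \to -\mathbf{k}$) in the second term in (6.96), we
finally have

$$\langle 0|T(\hat\phi_C(x_1)\hat\phi_C(x_2))|0\rangle = \int\frac{\mathrm{d}^4k}{(2\pi)^4}\,\frac{i}{2\omega_k}\,e^{-ik\cdot(x_1 - x_2)}\left\{\frac{1}{k_0 - (\omega_k - i\epsilon)} - \frac{1}{k_0 + \omega_k - i\epsilon}\right\}$$
$$= \int\frac{\mathrm{d}^4k}{(2\pi)^4}\,e^{-ik\cdot(x_1 - x_2)}\,\frac{i}{k_0^2 - (\omega_k - i\epsilon)^2} \tag{6.97}$$

or

$$\langle 0|T(\hat\phi_C(x_1)\hat\phi_C(x_2))|0\rangle = \int\frac{\mathrm{d}^4k}{(2\pi)^4}\,e^{-ik\cdot(x_1 - x_2)}\,\frac{i}{k_0^2 - \mathbf{k}^2 - m_C^2 + i\epsilon} \tag{6.98}$$

¹ We know that the left-hand side of (6.95) is Lorentz invariant, and that ($t_1 - t_2$, $\mathbf{x}_1 - \mathbf{x}_2$)
form the components of a 4-vector. The quantities ($k_0 = z$, $\mathbf{k}$) must also form the
components of a 4-vector, in order for the exponentials in (6.95) to be invariant.

where in the last step we have used $\omega_k^2 = \mathbf{k}^2 + m_C^2$ and written ‘$i\epsilon$’ for ‘$2i\epsilon\omega_k$’,
since what matters is just the sign of the small imaginary part (note that $\omega_k$ is
defined as the positive square root). In this final form, the Lorentz invariance
of the scalar propagator is indeed manifest.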
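The algebra that combines the two poles in (6.97) into the covariant denominator of (6.98) can be checked numerically. This is a minimal sketch using plain complex arithmetic; the function names are illustrative, and the identification of the covariant $\epsilon$ with $2\epsilon\omega_k$ follows the remark in the text.

```python
import math

def two_pole_integrand(k0, omega, eps):
    """Momentum-space factor in (6.97) before the two poles are combined."""
    pole_sum = 1.0 / (k0 - (omega - 1j * eps)) - 1.0 / (k0 + omega - 1j * eps)
    return 1j / (2.0 * omega) * pole_sum

def covariant_integrand(k0, k_vec_sq, m_sq, eps_cov):
    """Momentum-space factor in (6.98): i / (k0^2 - k^2 - m^2 + i eps)."""
    return 1j / (k0**2 - k_vec_sq - m_sq + 1j * eps_cov)

# Example values (arbitrary units): the two forms agree up to terms of order eps.
k_vec_sq, m_sq, eps = 2.0, 1.5, 1e-9
omega = math.sqrt(k_vec_sq + m_sq)
difference = abs(two_pole_integrand(0.2, omega, eps)
                 - covariant_integrand(0.2, k_vec_sq, m_sq, 2.0 * eps * omega))
```

Note that `k0` is a free variable here and is not set equal to `omega`, in line with the warning after (6.95).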

We shall have more to say about this propagator (Green function) in
section 6.3.3. For the moment we simply note two points: first, it is the Fourier
transform of $i/(k^2 - m_C^2 + i\epsilon)$, as stated in appendix G, where $k^2 = k_0^2 - \mathbf{k}^2$;
and second, it is a function of the coordinate difference $x_1 - x_2$, as it has to
be since we do not expect physics to depend on the choice of origin. This
second point gives us a clue as to how best to perform the $x_1$-$x_2$ integral
in (6.91). Let us introduce the new variables $x = x_1 - x_2$, $X = (x_1 + x_2)/2$.
Then (problem 6.6) (6.91) reduces to


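$$(-ig)^2(2\pi)^4\delta^4(p_A + p_B - p'_A - p'_B)\int \mathrm{d}^4x\,e^{iq\cdot x}\int\frac{\mathrm{d}^4k}{(2\pi)^4}\,e^{-ik\cdot x}\,\frac{i}{k^2 - m_C^2 + i\epsilon} \tag{6.99}$$

$$= (-ig)^2(2\pi)^4\delta^4(p_A + p_B - p'_A - p'_B)\,\frac{i}{q^2 - m_C^2 + i\epsilon} \tag{6.100}$$

where $q = p_A - p'_B = p'_A - p_B$ is the 4-momentum transfer carried by the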
exchanged C-quantum in figure 6.4, and we have used the four-dimensional 
version of (E.26). We associate this single expression, which includes the 
two coordinate space processes of figure 6.2, with the single momentum-space 
Feynman diagram of figure 6.4. The arrows refer merely to the flow of 4- 
momentum, which is conserved at each ‘vertex’ (i.e. meeting of three lines). 
Thus although the arrow on the exchanged C-line is drawn as indicated, this 
has nothing to do with any presumed order of emission/absorption of the 
exchanged quantum. It cannot do so, after all, since in this diagram the states 
all have definite 4-momentum and hence are totally delocalized in space-time; 
equivalently, we recall from (6.91) that the amplitude in fact involves integrals 
over all space-time. 

A similar analysis (problem 6.7) shows that the contribution of the
contractions (6.88) to the S-matrix element (6.74) is

$$(-ig)^2(2\pi)^4\delta^4(p_A + p_B - p'_A - p'_B)\,\frac{i}{(p_A + p_B)^2 - m_C^2 + i\epsilon} \tag{6.101}$$

which is represented by the momentum-space Feynman diagram of figure 6.5.

FIGURE 6.4
Momentum-space Feynman diagram corresponding to the $O(g^2)$ amplitude of
(6.100).

FIGURE 6.5
Momentum-space Feynman diagram corresponding to the $O(g^2)$ amplitude of
(6.101).

At this point we may start to write down the Feynman rules for the ABC
theory, which enable us to associate a precise mathematical expression for an
amplitude with a Feynman diagram such as figure 6.4 or figure 6.5. It is clear
that we will always have a factor $(2\pi)^4\delta^4(p_A + p_B - p'_A - p'_B)$ for all ‘connected’
diagrams, following from the flow of the conserved 4-momentum through the
diagrams. It is conventional to extract this factor, and to define the invariant
amplitude $\mathcal{M}_{fi}$ via

$$S_{fi} = \delta_{fi} + i(2\pi)^4\delta^4(p_f - p_i)\mathcal{M}_{fi} \tag{6.102}$$

in general (cf (6.57)). The rules reconstruct the invariant amplitude $i\mathcal{M}_{fi}$
corresponding to a given diagram, and for the present case they are:


(i) At each vertex, a factor $-ig$.

(ii) For each internal line, a factor

$$\frac{i}{q_i^2 - m_i^2 + i\epsilon} \tag{6.103}$$

where $i$ = A, B or C and $q_i$ is the 4-momentum carried by that line.
The factor (6.103) is the Feynman propagator in momentum space,
for the scalar particle ‘$i$’.


Of course, it is no big deal to give a set of rules which will just reconstruct 
(6.100) and (6.101). The real power of the ‘rules’ is that they work for all 
diagrams we can draw by joining together vertices and propagators (except 
that we have not yet explained what to do if more than one particle appears 
‘internally’ between two vertices, as in figures 6.3(a)-(c): see section 6.3.5). 
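Rules (i) and (ii) can be applied mechanically to the tree diagrams of figures 6.4 and 6.5. The following is a minimal sketch, assuming a real value of $q^2$ away from the pole so that $\epsilon$ may be set to zero; the function names are illustrative and not part of the text.

```python
def vertex(g):
    """Rule (i): a factor -i g at each vertex."""
    return -1j * g

def propagator(q_sq, m_sq, eps=0.0):
    """Rule (ii): i / (q^2 - m^2 + i eps) for an internal scalar line."""
    return 1j / (q_sq - m_sq + 1j * eps)

def tree_exchange_iM(g, q_sq, m_c):
    """i*M for a single C line joining two vertices (figure 6.4 or figure 6.5)."""
    return vertex(g) * propagator(q_sq, m_c**2) * vertex(g)

# Example: g = 2, q^2 = -1, m_C = 1 gives i*M = (-2i)^2 * i/(-2) = 2i.
example_iM = tree_exchange_iM(2.0, -1.0, 1.0)
```

With $q^2 = u$ this reproduces the factor multiplying $(2\pi)^4\delta^4$ in (6.100), and with $q^2 = (p_A + p_B)^2$ that in (6.101).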


6.3.3 A + B → A + B scattering: the Yukawa exchange
mechanism, s and u channel processes


Referring back to section 1.3.3, equation (1.28), we see that the amplitude for
the exchange process of figure 6.4 indeed has the form suggested there, namely
$\sim g^2/(q^2 - m_C^2)$ if C is exchanged. We have seen how, in the static limit, this
may be interpreted as a Yukawa interaction of range $\hbar/m_Cc$ between the
particles A and B, treated in the Born approximation. Expression (6.100), then,
provides us with the correct relativistic formula for this Yukawa mechanism.

There is more to be said about this fundamental amplitude (6.100), which
is essentially the C propagator in momentum space. While it is always true
that $p_i^2 = m_i^2$ for a free particle of 4-momentum $p_i$ and rest mass $m_i$, it is
not the case that $q^2 = m_C^2$ in (6.100). We emphasized after (6.95) that the
variable $k_0$ introduced there was not equal to $(\mathbf{k}^2 + m_C^2)^{1/2}$, and the result
of the step (6.99) to (6.100) was to replace $k_0$ by $q_0$ and $\mathbf{k}$ by $\mathbf{q}$, so that
$q_0 \neq (\mathbf{q}^2 + m_C^2)^{1/2}$, i.e. $q^2 = q_0^2 - \mathbf{q}^2 \neq m_C^2$. So the exchanged quantum in
figure 6.4 does not satisfy the ‘mass-shell condition’ $p^2 = m^2$; it is said to be
‘off-mass shell’ or ‘virtual’ (see also problem 6.8). It is quite a different entity
from a free quantum. Indeed, as we saw in more elementary physical terms
in section 1.3.2, it has a fleeting existence, as sanctioned by the uncertainty
relation.

It is convenient, at this point, to introduce some kinematic variables which
will appear often in following chapters. These are the ‘Mandelstam variables’
(Mandelstam 1958, 1959)

$$s = (p_A + p_B)^2 \qquad t = (p_A - p'_A)^2 \qquad u = (p_A - p'_B)^2. \tag{6.104}$$

They are clearly relativistically invariant. In terms of these variables the
amplitude (6.100) is essentially $\sim 1/(u - m_C^2 + i\epsilon)$, and the amplitude (6.101)
is $\sim 1/(s - m_C^2 + i\epsilon)$. The first is said to be a ‘u-channel process’, the second
an ‘s-channel process’. Amplitudes of the form $(t - m^2)^{-1}$ or $(u - m^2)^{-1}$
are basically one-quantum exchange (i.e. ‘force’) processes, while those of the
form $(s - m^2)^{-1}$ have a rather different interpretation, as we now discuss.

FIGURE 6.6
$O(e^2)$ contribution to $e^+e^- \to e^+e^-$ via annihilation to (and re-emission from)
a virtual $\gamma$ state.
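The Mandelstam variables of (6.104) are easy to evaluate for explicit 4-momenta. The sketch below is an illustration under stated assumptions: it builds elastic A + B → A + B kinematics in the CM frame with a scattering angle `theta` in the x-z plane (a choice made here, not in the text) and uses the metric $p^2 = E^2 - \mathbf{p}^2$. It also checks the standard identity $s + t + u = 2m_A^2 + 2m_B^2$, which is not stated in this section but holds for this process.

```python
import math

def minkowski_sq(p):
    """p^2 = E^2 - |p|^2 for a 4-vector p = (E, px, py, pz)."""
    e, px, py, pz = p
    return e * e - (px * px + py * py + pz * pz)

def add(p, q):
    return tuple(a + b for a, b in zip(p, q))

def sub(p, q):
    return tuple(a - b for a, b in zip(p, q))

def cm_momenta(p_mag, theta, m_a, m_b):
    """4-momenta for elastic A + B -> A + B in the CM frame."""
    e_a = math.sqrt(m_a**2 + p_mag**2)
    e_b = math.sqrt(m_b**2 + p_mag**2)
    p_a = (e_a, 0.0, 0.0, p_mag)
    p_b = (e_b, 0.0, 0.0, -p_mag)
    px, pz = p_mag * math.sin(theta), p_mag * math.cos(theta)
    p_a_out = (e_a, px, 0.0, pz)
    p_b_out = (e_b, -px, 0.0, -pz)
    return p_a, p_b, p_a_out, p_b_out

def mandelstam(p_a, p_b, p_a_out, p_b_out):
    """Return (s, t, u) as defined in (6.104)."""
    s = minkowski_sq(add(p_a, p_b))
    t = minkowski_sq(sub(p_a, p_a_out))
    u = minkowski_sq(sub(p_a, p_b_out))
    return s, t, u

s, t, u = mandelstam(*cm_momenta(1.5, 0.7, 1.0, 2.0))
```

In this frame $s$ is the square of the total CM energy, while $t = -2\mathbf{p}^2(1 - \cos\theta)$ is never positive.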

Let us first ask: can $s = (p_A + p_B)^2$ ever equal $m_C^2$ in (6.101)? Since $s$ is
invariant, we can evaluate it in any frame we like, for example the
centre-of-momentum (CM) frame in which

$$(p_A + p_B)^2 = (E_A + E_B)^2 \tag{6.105}$$

with $E_A = (m_A^2 + \mathbf{p}^2)^{1/2}$, $E_B = (m_B^2 + \mathbf{p}^2)^{1/2}$. It is then clear that if $m_C <
m_A + m_B$ the condition $(p_A + p_B)^2 = m_C^2$ can never be satisfied, and the internal
quantum in figure 6.5 is always virtual (note that $p_A + p_B$ is the 4-momentum
of the C-quantum). Depending on the details of the theory with which we
are dealing, such an s-channel process can have different interpretations. In
QED, for example, in the process $e^+ + e^- \to e^+ + e^-$ we could have a virtual $\gamma$
s-channel process as shown in figure 6.6. This would be called an ‘annihilation
process’ for obvious reasons. In the process $\gamma + e^- \to \gamma + e^-$, however, we could
have figure 6.7, which would be interpreted as an absorption and re-emission
process (i.e. of a photon).

However, if $m_C > m_A + m_B$, then we can indeed satisfy $(p_A + p_B)^2 = m_C^2$,
and so (remembering that $\epsilon$ is infinitesimal) we seem to have an infinite result
when $s$ (the square of the CM energy) hits the value $m_C^2$. In fact, this is not the
case. If $m_C > m_A + m_B$, the C-particle is unstable against decay to A + B, as
we saw in section 6.3.1. The s-channel process must then be interpreted as the
formation of a resonance, i.e. of the transitory and decaying state consisting
of the single C-particle. Such a process would be described non-relativistically
by a Breit-Wigner amplitude of the form

$$\mathcal{M} \propto 1/(E - E_R + i\Gamma/2) \tag{6.106}$$

which produces a peak in $|\mathcal{M}|^2$ centred at $E = E_R$ and full width $\Gamma$ at
half-height; $\Gamma$ is, in fact, precisely the width calculated in section 6.3.1. The
relativistic generalization of (6.106) is

$$\mathcal{M} \propto \frac{1}{s - M^2 + iM\Gamma} \tag{6.107}$$

where $M$ is the mass of the unstable particle. Thus in the present case the
prescription for avoiding the infinity in our amplitude is to replace the
infinitesimal ‘$i\epsilon$’ in (6.101) by the finite quantity $im_C\Gamma$, with $\Gamma$ as calculated
in section 6.3.1. We shall see examples of such s-channel resonances in
section 9.5.

FIGURE 6.7
$O(e^2)$ contribution to $\gamma e^- \to \gamma e^-$ via absorption to (and re-emission from) a
virtual $e^-$ state.
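The statement that (6.106) gives a peak of full width $\Gamma$ at half-height can be verified directly. This is a minimal sketch with the overall normalization set to one (an arbitrary choice, since (6.106) and (6.107) are only proportionalities); the function names are illustrative.

```python
def breit_wigner_sq(e, e_r, gamma):
    """|M|^2 for M = 1 / (E - E_R + i Gamma / 2), cf. (6.106)."""
    return abs(1.0 / (e - e_r + 0.5j * gamma)) ** 2

def relativistic_bw_sq(s, mass, gamma):
    """|M|^2 for M = 1 / (s - M^2 + i M Gamma), cf. (6.107)."""
    return abs(1.0 / (s - mass**2 + 1j * mass * gamma)) ** 2

# The peak value is 4 / Gamma^2, and it drops to half at E = E_R +/- Gamma / 2.
e_r, gamma = 10.0, 0.4
peak = breit_wigner_sq(e_r, e_r, gamma)
half = breit_wigner_sq(e_r + 0.5 * gamma, e_r, gamma)
```

In the relativistic form the amplitude stays finite at $s = M^2$, where $|\mathcal{M}|^2 = 1/(M\Gamma)^2$ in this normalization.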


6.3.4 A + B → A + B scattering: the differential
cross section


We complete this exercise in the ‘ABC’ theory by showing how to calculate the
cross section for A + B → A + B scattering in terms of the invariant amplitude
$\mathcal{M}_{fi}$ of (6.102). The discussion will closely parallel the calculation of the decay
rate $\Gamma$ in section 6.3.1.

As in (6.56), the transition rate per unit volume, in this case, is

$$\dot P_{fi} = (2\pi)^4\delta^4(p_A + p_B - p'_A - p'_B)\,|\mathcal{M}_{fi}|^2. \tag{6.108}$$

In order to obtain a quantity which may be compared from experiment to
experiment, we must remove the dependence of the transition rate on the
incident flux of particles and on the number of target particles per unit volume.
Now the flux of beam particles (‘A’ ones, let us say) incident on a stationary
target is just the number of particles per unit area reaching the target in unit
time which, with our normalization of ‘$2E$ particles per unit volume’, is just

$$|\mathbf{v}|\,2E_A \tag{6.109}$$

where $\mathbf{v}$ is the velocity of the incident A in the rest frame of the target B.
The number of target particles per unit volume is $2E_B$ ($= 2m_B$ for B at rest,
of course).

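We must also include the ‘density of final states’ factors, as in (6.59).
Putting all this together, the total cross section $\sigma$ is given in terms of the
differential cross section $\mathrm{d}\sigma$ by

$$\sigma = \int \mathrm{d}\sigma = \frac{1}{2E_B\,2E_A|\mathbf{v}|}\int (2\pi)^4\delta^4(p_A + p_B - p'_A - p'_B)\,|\mathcal{M}_{fi}|^2\,\frac{\mathrm{d}^3\mathbf{p}'_A}{(2\pi)^3 2E'_A}\,\frac{\mathrm{d}^3\mathbf{p}'_B}{(2\pi)^3 2E'_B}$$
$$= \frac{1}{4E_AE_B|\mathbf{v}|}\int |\mathcal{M}_{fi}|^2\,\mathrm{dLips}(s; p'_A, p'_B), \tag{6.110}$$

where we have introduced the Lorentz invariant phase space $\mathrm{dLips}(s; p'_A, p'_B)$
defined by

$$\mathrm{dLips}(s; p'_A, p'_B) = \frac{1}{(4\pi)^2}\,\delta^4(p_A + p_B - p'_A - p'_B)\,\frac{\mathrm{d}^3\mathbf{p}'_A}{E'_A}\,\frac{\mathrm{d}^3\mathbf{p}'_B}{E'_B}. \tag{6.111}$$

We can write the flux factor for collinear collisions in invariant form using the
relation (easily verified in a particular frame (problem 6.9))

$$E_AE_B|\mathbf{v}| = [(p_A\cdot p_B)^2 - m_A^2m_B^2]^{1/2}. \tag{6.112}$$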


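Everything in (6.110) is now written in invariant form.
It is a useful exercise to evaluate $\mathrm{d}\sigma$ in a given frame, and the simplest
one is the centre-of-momentum (CM) frame defined by

$$\mathbf{p}_A + \mathbf{p}_B = \mathbf{p}'_A + \mathbf{p}'_B = 0. \tag{6.113}$$

However, before specializing to this frame, it is convenient to simplify our
expression for dLips. Using the 3-momentum part of the $\delta$-function in (6.110),
we can eliminate the integral over $\mathrm{d}^3\mathbf{p}'_B$:

$$\int\frac{\mathrm{d}^3\mathbf{p}'_B}{E'_B}\,\delta^4(p_A + p_B - p'_A - p'_B) = \frac{1}{E'_B}\,\delta(E_A + E_B - E'_A - E'_B), \tag{6.114}$$

remembering also that now $\mathbf{p}'_B$ has to be replaced by $\mathbf{p}_A + \mathbf{p}_B - \mathbf{p}'_A$ in $\mathcal{M}_{fi}$. On
the right-hand side of (6.114), $\mathbf{p}'_B$ and $E'_B$ are no longer independent variables
but are determined by the conditions

$$\mathbf{p}'_B = \mathbf{p}_A + \mathbf{p}_B - \mathbf{p}'_A \qquad E'_B = (m_B^2 + \mathbf{p}'^2_B)^{1/2}. \tag{6.115}$$

Next, convert $\mathrm{d}^3\mathbf{p}'_A$ to angular variables

$$\mathrm{d}^3\mathbf{p}'_A = \mathbf{p}'^2_A\,\mathrm{d}|\mathbf{p}'_A|\,\mathrm{d}\Omega. \tag{6.116}$$

The energy $E'_A$ is given by

$$E'_A = (m_A^2 + \mathbf{p}'^2_A)^{1/2} \tag{6.117}$$

so that

$$E'_A\,\mathrm{d}E'_A = |\mathbf{p}'_A|\,\mathrm{d}|\mathbf{p}'_A|. \tag{6.118}$$

With all these changes we arrive at the result (valid in any frame)

$$\mathrm{dLips}(s; p'_A, p'_B) = \frac{1}{(4\pi)^2}\,\frac{|\mathbf{p}'_A|\,\mathrm{d}E'_A}{E'_B}\,\mathrm{d}\Omega\,\delta(E_A + E_B - E'_A - E'_B). \tag{6.119}$$

We now specialize to the CM frame for which $\mathbf{p}_A = \mathbf{p} = -\mathbf{p}_B$, $\mathbf{p}'_A = \mathbf{p}' =
-\mathbf{p}'_B$, and

$$E'_A = (m_A^2 + \mathbf{p}'^2)^{1/2} \qquad E'_B = (m_B^2 + \mathbf{p}'^2)^{1/2} \tag{6.120}$$

so that

$$E'_A\,\mathrm{d}E'_A = |\mathbf{p}'|\,\mathrm{d}|\mathbf{p}'| = E'_B\,\mathrm{d}E'_B. \tag{6.121}$$

Introduce the variable $W' = E'_A + E'_B$ (note that $W'$ is only constrained
to equal the total energy $W = E_A + E_B$ after the integral over the
energy-conserving $\delta$-function has been performed). Then (as in (6.62))

$$\mathrm{d}W' = \mathrm{d}E'_A + \mathrm{d}E'_B = \frac{W'|\mathbf{p}'|\,\mathrm{d}|\mathbf{p}'|}{E'_AE'_B} = \frac{W'}{E'_B}\,\mathrm{d}E'_A \tag{6.122}$$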


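where we have used (6.121) in each of the last two steps. Thus the factor

$$\frac{|\mathbf{p}'_A|\,\mathrm{d}E'_A}{E'_B}\,\delta(E_A + E_B - E'_A - E'_B) \tag{6.123}$$

becomes

$$\frac{|\mathbf{p}'|}{W'}\,\mathrm{d}W'\,\delta(W - W') \tag{6.124}$$

which reduces to $|\mathbf{p}|/W$ after integrating over $W'$, since the energy-conservation
relation forces $|\mathbf{p}'| = |\mathbf{p}|$. We arrive at the important result

$$\mathrm{dLips}(s; p'_A, p'_B) = \frac{1}{(4\pi)^2}\,\frac{|\mathbf{p}|}{W}\,\mathrm{d}\Omega \tag{6.125}$$

for the two-body phase space in the CM frame.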
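The last piece in the puzzle is the evaluation of the flux factor (6.112) in
the CM frame. In the CM we have

$$p_A\cdot p_B = (E_A, \mathbf{p})\cdot(E_B, -\mathbf{p}) \tag{6.126}$$
$$= E_AE_B + \mathbf{p}^2 \tag{6.127}$$

and a straightforward calculation shows that

$$(p_A\cdot p_B)^2 - m_A^2m_B^2 = \mathbf{p}^2W^2.$$

Hence we finally have

$$\mathrm{d}\sigma = \frac{1}{4|\mathbf{p}|W}\,\frac{1}{(4\pi)^2}\,\frac{|\mathbf{p}|}{W}\,|\mathcal{M}_{fi}|^2\,\mathrm{d}\Omega$$

and the CM differential cross section is

$$\left.\frac{\mathrm{d}\sigma}{\mathrm{d}\Omega}\right|_{\mathrm{CM}} = \frac{1}{(8\pi W)^2}\,|\mathcal{M}_{fi}|^2. \tag{6.129}$$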
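The pieces assembled in this section can be cross-checked numerically. The sketch below is a minimal illustration in natural units, assuming the standard expression for the CM momentum in terms of $W$ and the masses (not written out in the text) and the relation $p_A\cdot p_B = (s - m_A^2 - m_B^2)/2$; it confirms that the invariant flux factor of (6.112) equals $|\mathbf{p}|W$ in the CM frame and evaluates $|\mathcal{M}_{fi}|^2/(8\pi W)^2$. The function names are illustrative.

```python
import math

def cm_momentum(w, m_a, m_b):
    """|p| of either particle in the CM frame at total energy W = sqrt(s)."""
    return math.sqrt((w**2 - (m_a + m_b)**2) * (w**2 - (m_a - m_b)**2)) / (2.0 * w)

def invariant_flux(w, m_a, m_b):
    """[(p_A . p_B)^2 - m_A^2 m_B^2]^(1/2), the right-hand side of (6.112)."""
    pa_dot_pb = 0.5 * (w**2 - m_a**2 - m_b**2)
    return math.sqrt(pa_dot_pb**2 - (m_a * m_b)**2)

def dsigma_domega_cm(m_fi_sq, w):
    """CM differential cross section |M_fi|^2 / (64 pi^2 W^2)."""
    return m_fi_sq / (64.0 * math.pi**2 * w**2)

# Example: W = 5, m_A = 1, m_B = 2 (arbitrary units).
w, m_a, m_b = 5.0, 1.0, 2.0
flux = invariant_flux(w, m_a, m_b)
p_times_w = cm_momentum(w, m_a, m_b) * w
```

For an amplitude that does not depend on angle, integrating over $\mathrm{d}\Omega$ simply multiplies the result of `dsigma_domega_cm` by $4\pi$.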


6.3.5 A + B → A + B scattering: loose ends


We must now return to the amplitudes represented by figures 6.3(a)-(c), 
which we set aside earlier. Consider first figure 6.3(b). Here the A-particle has 
continued through without interacting, while the B-particle has made a virtual 
transition to the ‘A + C’ state, and then this state has reverted to the original 
B-state. So this is in the nature of a correction to the ‘no-scattering’ piece 
shown in figure 6.1, and does not contribute to $\mathcal{M}_{fi}$. However, such a virtual
transition B → A + C → B does represent a modification of the properties of
the original single B state, due to its interactions with other fields as specified
in $\hat H'$. We can easily imagine how, at order $g^4$, an amplitude will occur in
which such a virtual process is inserted into the C propagator in figure 6.4 so 
as to arrive at figure 6.8, from which it is plausible that such emission and 
reabsorption processes by the same particle effectively modify the propagator 
for this particle. This, in turn, suggests that part, at least, of their effect will 
be to modify the mass of the affected particle, so as to change it from the 
original value specified in the Lagrangian. We may think of this physically 
as being associated, in some way, with a particle’s carrying with it a ‘cloud’ 
of virtual particles, with which it is continually interacting; this will affect its 
mass, much as the mass of an electron in a solid becomes an ‘effective’ mass 
due to the various interactions experienced by the electron inside the solid. 

We shall postpone the evaluation of amplitudes such as those represented 
by figures 6.3(b) and (c) to chapter 10. However, we note here just one feature: 
4-momentum conservation applied at each vertex in figure 6.3(b) does not 
determine the individual 4-momenta of the intermediate A and C particles, 
only the sum of their 4-momenta, which is equal to $p_B$ (and this is equal to $p'_B$
also, so indeed no scattering has occurred). It is plausible that, if an internal
4-momentum in a diagram is undetermined in terms of the external (fixed) 4- 
momenta of the physical process, then that undetermined 4-momentum should 
be integrated over. This is the case, as can be verified straightforwardly by 
evaluating the amplitude (6.86), for example, as we evaluated (6.89); a similar 
calculation will be gone through in detail in chapter 10, section 10.1.1. The 
corresponding Feynman rule is

(iii) For each internal 4-momentum $k$ which is not fixed by 4-momentum
conservation, carry out the integration $\int \mathrm{d}^4k/(2\pi)^4$. One such
integration with respect to an internal 4-momentum occurs for each
closed loop.

FIGURE 6.8
$O(g^4)$ contribution to the process A + B → A + B, in which a virtual
transition C → A + B → C occurs in the C propagator.


If we apply this new rule to figure 6.3(b), we find that we need to evaluate
the integral

$$\int\frac{\mathrm{d}^4k}{(2\pi)^4}\,\frac{i}{(k^2 - m_C^2)}\,\frac{i}{((p_B - k)^2 - m_A^2)} \tag{6.130}$$

which, by simple counting of powers of k in numerator and denominator, is 
logarithmically divergent. Thus we learn that, almost before we have started 
quantum field theory in earnest, we seem to have run into a serious problem, 
which is going to affect all higher-order processes containing loops. The pro- 
cedure whereby these infinities are tamed is called renormalization, and we 
shall return to it in chapter 10. 
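The power counting behind the logarithmic divergence can be made concrete with a deliberately simplified toy integral. The sketch below is not the evaluation of (6.130) itself: it assumes a Euclidean radial integral with a hard momentum cutoff, zero external momentum and equal masses, so that the integrand behaves like $k^3/(k^2 + m^2)^2$. The function names and the cutoff are illustrative assumptions only.

```python
import math

def radial_loop_integral(cutoff, mass):
    """Closed form of the toy integral int_0^cutoff k^3 dk / (k^2 + mass^2)^2."""
    u = cutoff**2 + mass**2
    return 0.5 * (math.log(u / mass**2) + mass**2 / u - 1.0)

def radial_loop_numeric(cutoff, mass, steps=200000):
    """Midpoint-rule estimate of the same integral, as an independent check."""
    h = cutoff / steps
    total = 0.0
    for n in range(steps):
        k = (n + 0.5) * h
        total += k**3 / (k**2 + mass**2) ** 2
    return total * h

# Doubling a large cutoff adds approximately ln 2: the growth is logarithmic.
growth = radial_loop_integral(2.0e4, 1.0) - radial_loop_integral(1.0e4, 1.0)
```

Four powers of $k$ from the measure against four from the two propagators give exactly this slow, cutoff-dependent growth.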

Finally, what about figure 6.3(a)? In this case nothing at all has occurred 
to either of the scattering particles, and instead a virtual trio of A + B + C has 
appeared from the vacuum, and then disappeared back again. Such processes 
are called, obviously enough, vacuum diagrams. This particular one is in 
fact only (another) correction to figure 6.1, and it makes no contribution to
$\mathcal{M}_{fi}$. But as with figure 6.8, at $O(g^4)$ we can imagine such a vacuum process
appearing ‘alongside’ figure 6.4 or figure 6.5, as in figures 6.9(a) and (b).
These are called ‘disconnected diagrams’ and, since in them A and B have
certainly interacted, they will contribute to $\mathcal{M}_{fi}$ (note that they are in this
respect quite different from the ‘straight through’ diagrams of figures 6.3(b)
and (c)). However, it turns out, rather remarkably, that their effect is exactly
compensated by another effect we have glossed over, namely the fact that the
vacuum $|0\rangle$ we have used in our S-matrix elements is plainly the unperturbed
vacuum (or ground state), whereas surely the introduction of interactions will
perturb it. A careful analysis of this (Peskin and Schroeder 1995, section 7.2)
shows that $\mathcal{M}_{fi}$ is to be calculated from only the connected Feynman diagrams.

FIGURE 6.9
$O(g^4)$ disconnected diagrams in A + B → A + B: panels (a) and (b).

In this chapter we have seen how the Feynman rules for scattering and
decay amplitudes in a simple scalar theory are derived, and also how cross
sections and decay rates are calculated. We have found a Yukawa (u-channel)
exchange process in its covariant form, together with the analogous s-channel
process, and have had a hint of the complications which arise when loops are
considered, at higher order in $g$. Unfortunately, however, none of this applies
directly to any real physical process, since we do not know of any physical
‘scalar ABC’ interaction. Rather, the interactions in the Standard Model are
all gauge interactions similar to electrodynamics (with the exception of the
Higgs sector, which has both cubic and quartic scalar interactions). The
mediating quanta of these gauge interactions have spin-1, not zero; furthermore,
the matter fields (again apart from the Higgs field) have spin-1/2. It is time to
begin discussing the complications of spin and the particular form of dynamics
associated with the ‘gauge principle’.


Problems 


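6.1 Show that, for a quantum field $\hat f(t)$ (suppressing the space coordinates),

$$\int_{-\infty}^{\infty}\mathrm{d}t_1\int_{-\infty}^{t_1}\mathrm{d}t_2\,\hat f(t_1)\hat f(t_2) = \frac{1}{2}\int_{-\infty}^{\infty}\mathrm{d}t_1\int_{-\infty}^{\infty}\mathrm{d}t_2\,T(\hat f(t_1)\hat f(t_2))$$

where

$$T(\hat f(t_1)\hat f(t_2)) = \hat f(t_1)\hat f(t_2) \quad \text{for } t_1 > t_2$$
$$\phantom{T(\hat f(t_1)\hat f(t_2))} = \hat f(t_2)\hat f(t_1) \quad \text{for } t_2 > t_1.$$

6.2 Verify equation (6.65).

6.3 Let $\hat\phi(x, t)$ be a real scalar KG field in one space dimension, satisfying

$$(\Box_x + m^2)\hat\phi(x, t) \equiv \left(\frac{\partial^2}{\partial t^2} - \frac{\partial^2}{\partial x^2} + m^2\right)\hat\phi(x, t) = 0.$$

(a) Explain why

$$T(\hat\phi(x_1, t_1)\hat\phi(x_2, t_2)) = \theta(t_1 - t_2)\hat\phi(x_1, t_1)\hat\phi(x_2, t_2) + \theta(t_2 - t_1)\hat\phi(x_2, t_2)\hat\phi(x_1, t_1)$$

(see equation (E.47) for a definition of the $\theta$-function).

(b) Using equation (E.46), show that

$$\frac{\mathrm{d}}{\mathrm{d}x}\theta(x - a) = \delta(x - a).$$

(c) Using the result of (b) with appropriate changes of variable, and
equation (5.118), show that

$$\frac{\partial}{\partial t_1}\{T(\hat\phi(x_1, t_1)\hat\phi(x_2, t_2))\} = \theta(t_1 - t_2)\dot{\hat\phi}(x_1, t_1)\hat\phi(x_2, t_2) + \theta(t_2 - t_1)\hat\phi(x_2, t_2)\dot{\hat\phi}(x_1, t_1).$$

(d) Using (5.117) and (5.122) show that

$$\frac{\partial^2}{\partial t_1^2}\{T(\hat\phi(x_1, t_1)\hat\phi(x_2, t_2))\} = -i\delta(x_1 - x_2)\delta(t_1 - t_2) + T(\ddot{\hat\phi}(x_1, t_1)\hat\phi(x_2, t_2))$$

and hence show that

$$\left(\frac{\partial^2}{\partial t_1^2} - \frac{\partial^2}{\partial x_1^2} + m^2\right)T(\hat\phi(x_1, t_1)\hat\phi(x_2, t_2)) = -i\delta(x_1 - x_2)\delta(t_1 - t_2).$$

This shows that $T(\hat\phi(x_1, t_1)\hat\phi(x_2, t_2))$ is a Green function (see
appendix G, equation (G.25); the $i$ is included here conventionally)
for the KG operator

$$\frac{\partial^2}{\partial t_1^2} - \frac{\partial^2}{\partial x_1^2} + m^2.$$

The four-dimensional generalization is immediate.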


6.4 Verify (6.90). 
6.5 Verify (6.92).

6.6 Verify (6.99) and (6.100). 

6.7 Show that the contribution of the contractions (6.88) to the S-matrix 
element (6.74) is given by (6.101). 


6.8 Consider the case of equal masses $m_A = m_B = m_C$. Evaluate $u$ of (6.104)
in the CM frame (compare section 1.3.6), and show that $u \leq 0$, so that $u$
can never equal $m_C^2$ in (6.100). (This result is generally true for such single
particle ‘exchange’ processes.)


6.9 Verify (6.112). 


Quantum Field Theory III: Complex Scalar
Fields, Dirac and Maxwell Fields;
Introduction of Electromagnetic Interactions


In the previous two chapters we have introduced the formalism of relativistic 
quantum field theory for the case of free real scalar fields obeying the Klein— 
Gordon (KG) equation of section 3.1, extended it to describe interactions 
between such quantum fields and shown how the Feynman rules for a simple 
Yukawa-like theory are derived. It is now time to return to the unfortunately 
rather more complicated real world of quarks and leptons interacting via gauge 
fields — in particular electromagnetism. For this, several generalizations of the 
formalism of chapter 5 are necessary. 

First, a glance back at chapter 2 will remind the reader that the electro- 
magnetic interaction has everything to do with the phase of wavefunctions, 
and hence presumably of their quantum field generalizations: fields which are 
real must be electromagnetically neutral. Indeed, as noted very briefly in 
section 5.3, the quanta of a real scalar field are their own antiparticles; for 
a given mass, there is only one type of particle being created or destroyed. 
However, physical particles and antiparticles have identical masses (e.g. $e^-$ and
$e^+$), and it is actually a deep result of quantum field theory that this is so (see
section 4.2.5, and the end of section 7.1). In this case for a given mass m, there 
will have to be two distinct field degrees of freedom, one of which corresponds 
somehow to the ‘particle’, the other to the ‘antiparticle’. This suggests that we 
will need a complex field if we want to distinguish particle from antiparticle, 
even in the absence of electromagnetism (for example, the ($K^0$, $\bar K^0$) pair). Such
a distinction will have to be made in terms of some conserved quantum number 
(or numbers), having opposite values for ‘particle’ and ‘antiparticle’. This 
conserved quantum number must be associated with some symmetry. Now, 
referring again to chapter 2, we recall that electromagnetism is associated with 
invariance under local U(1) phase transformations. Even in the absence of 
electromagnetism, however, a theory with complex fields can exhibit a global 
U(1) phase invariance. As we shall show in section 7.1, such a symmetry 
indeed leads to the existence of a conserved quantum number, in terms of 
which we can distinguish the particle and antiparticle parts of a complex 
scalar field. 

In section 7.2 we generalize the complex scalar field to the complex spinor 
(Dirac) field, suitable for charged spin-$\frac{1}{2}$ particles. Again we find an analogous
conserved quantum number, associated with a global U(1) phase invariance of
the Lagrangian, which serves to distinguish particle from antiparticle. Cen- 
tral to the satisfactory physical interpretation of the Dirac field will be the 
requirement that it must be quantized with anticommutation relations — the 
famous ‘spin-statistics’ connection. 

The electromagnetic field must then be quantized, and section 7.3 describes
the considerable difficulties this poses. With all this in place, we can easily 
introduce (section 7.4) electromagnetic interactions via the ‘gauge principle’ 
of chapter 2. The resulting Lagrangians and Feynman rules will be applied to 
simple processes in the following chapter. In the final section of this chapter, 
we return to the discrete symmetries of chapter 4, and extend them from the 
single particle theory to quantum field theory. 




7.1 The complex scalar field: global U(1) phase 
invariance, particles and antiparticles 


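Consider a Lagrangian for two free fields $\hat\phi_1$ and $\hat\phi_2$ having the same mass $M$:

$$\hat{\mathcal L} = \tfrac{1}{2}\partial_\mu\hat\phi_1\partial^\mu\hat\phi_1 - \tfrac{1}{2}M^2\hat\phi_1^2 + \tfrac{1}{2}\partial_\mu\hat\phi_2\partial^\mu\hat\phi_2 - \tfrac{1}{2}M^2\hat\phi_2^2. \tag{7.1}$$

We shall see how this is appropriate to a ‘particle-antiparticle’ situation.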

In general ‘particle’ and ‘antiparticle’ are distinguished by having opposite 
values of one or more conserved additive quantum numbers. Since these quan- 
tum numbers are conserved, the operators corresponding to them commute 
with the Hamiltonian and are constant in time (in the Heisenberg formulation 
— see equation (5.59)); such operators are called symmetry operators and will 
be increasingly important in later chapters. For the present we consider the 
simplest case in which ‘particle’ and ‘antiparticle’ are distinguished by having 
opposite eigenvalues of just one symmetry operator. This situation is already 
realized in the simple Lagrangian of (7.1). The symmetry involved is just this: 
$\hat{\mathcal L}$ of (7.1) is left unchanged (is invariant) if $\hat\phi_1$ and $\hat\phi_2$ are replaced by $\hat\phi_1'$ and
$\hat\phi_2'$, where (cf (2.64))

$$\hat\phi_1' = (\cos\alpha)\hat\phi_1 - (\sin\alpha)\hat\phi_2$$
$$\hat\phi_2' = (\sin\alpha)\hat\phi_1 + (\cos\alpha)\hat\phi_2 \tag{7.2}$$

where $\alpha$ is a real parameter. This is like a rotation of coordinates about the
z-axis of ordinary space, but of course it mixes field degrees of freedom, not
spatial coordinates. The symmetry transformation of (7.2) is sometimes called an
‘O(2) transformation’, referring to the two-dimensional rotation group O(2).
We can easily check the invariance of $\hat{\mathcal L}$, i.e.

$$\hat{\mathcal L}(\hat\phi_1', \hat\phi_2') = \hat{\mathcal L}(\hat\phi_1, \hat\phi_2); \tag{7.3}$$

see problem 7.1.
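
The invariance (7.3) can also be spot-checked numerically. The following is only a sketch under stated assumptions: the fields are treated as classical numbers at a single space-time point, the metric signature is taken to be (+, -, -, -), and the sample values and function names are hypothetical choices made for illustration.

```python
import math

def minkowski_dot(a, b):
    """a_mu b^mu with metric signature (+, -, -, -) (assumed convention)."""
    return a[0] * b[0] - sum(x * y for x, y in zip(a[1:], b[1:]))

def lagrangian(phi1, phi2, dphi1, dphi2, mass):
    """Classical density of the form (7.1) at one point; dphi1, dphi2 are 4-gradients."""
    kinetic = 0.5 * minkowski_dot(dphi1, dphi1) + 0.5 * minkowski_dot(dphi2, dphi2)
    mass_term = 0.5 * mass**2 * (phi1**2 + phi2**2)
    return kinetic - mass_term

def rotate(x1, x2, alpha):
    """The O(2) rotation (7.2) applied to a pair of numbers."""
    c, s = math.cos(alpha), math.sin(alpha)
    return c * x1 - s * x2, s * x1 + c * x2

def rotated_lagrangian(phi1, phi2, dphi1, dphi2, mass, alpha):
    new_phi1, new_phi2 = rotate(phi1, phi2, alpha)
    # alpha is a constant (global) parameter, so each gradient component
    # rotates in exactly the same way as the fields themselves.
    pairs = [rotate(d1, d2, alpha) for d1, d2 in zip(dphi1, dphi2)]
    new_dphi1 = [p[0] for p in pairs]
    new_dphi2 = [p[1] for p in pairs]
    return lagrangian(new_phi1, new_phi2, new_dphi1, new_dphi2, mass)

# Arbitrary sample values (hypothetical, chosen only for the check)
phi1, phi2 = 0.8, -1.7
dphi1 = [0.3, 1.1, -0.4, 2.0]
dphi2 = [-0.9, 0.2, 0.6, -1.3]
mass, alpha = 1.5, 0.73

before = lagrangian(phi1, phi2, dphi1, dphi2, mass)
after = rotated_lagrangian(phi1, phi2, dphi1, dphi2, mass, alpha)
print(abs(after - before) < 1e-12)
```

The check works because both the mass term and the kinetic term depend only on the sums of squares of the two components, which a rotation preserves; the equal masses of the two fields are essential for this.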
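Now let us see what is the conservation law associated with this symmetry.
It is simpler (and sufficient) to consider an infinitesimal rotation characterized
by the infinitesimal parameter $\epsilon$, for which $\cos\epsilon \approx 1$ and $\sin\epsilon \approx \epsilon$ so that (7.2)
becomes

$$\hat\phi_1' = \hat\phi_1 - \epsilon\hat\phi_2$$
$$\hat\phi_2' = \hat\phi_2 + \epsilon\hat\phi_1 \tag{7.4}$$

and we can define changes $\delta\hat\phi_i$ by

$$\delta\hat\phi_1 \equiv \hat\phi_1' - \hat\phi_1 = -\epsilon\hat\phi_2$$
$$\delta\hat\phi_2 \equiv \hat\phi_2' - \hat\phi_2 = +\epsilon\hat\phi_1. \tag{7.5}$$

Under this transformation $\hat{\mathcal L}$ is invariant, and so $\delta\hat{\mathcal L} = 0$. But $\hat{\mathcal L}$ is an explicit
function of $\hat\phi_1$, $\hat\phi_2$, $\partial_\mu\hat\phi_1$ and $\partial_\mu\hat\phi_2$. Thus we can write

$$0 = \delta\hat{\mathcal L} = \frac{\partial\hat{\mathcal L}}{\partial(\partial_\mu\hat\phi_1)}\delta(\partial_\mu\hat\phi_1) + \frac{\partial\hat{\mathcal L}}{\partial(\partial_\mu\hat\phi_2)}\delta(\partial_\mu\hat\phi_2) + \frac{\partial\hat{\mathcal L}}{\partial\hat\phi_1}\delta\hat\phi_1 + \frac{\partial\hat{\mathcal L}}{\partial\hat\phi_2}\delta\hat\phi_2. \tag{7.6}$$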
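This is a bit like the manipulations leading up to the derivation of the
Euler-Lagrange equations in section 5.2.4, but now the changes $\delta\hat\phi_i$ ($i = 1, 2$) have
nothing to do with space-time trajectories: they mix up the two fields.
However, we can use the equations of motion for $\hat\phi_1$ and $\hat\phi_2$ to rewrite $\delta\hat{\mathcal L}$ as

$$0 = \frac{\partial\hat{\mathcal L}}{\partial(\partial_\mu\hat\phi_1)}\delta(\partial_\mu\hat\phi_1) + \frac{\partial\hat{\mathcal L}}{\partial(\partial_\mu\hat\phi_2)}\delta(\partial_\mu\hat\phi_2) + \left[\partial_\mu\left(\frac{\partial\hat{\mathcal L}}{\partial(\partial_\mu\hat\phi_1)}\right)\right]\delta\hat\phi_1 + \left[\partial_\mu\left(\frac{\partial\hat{\mathcal L}}{\partial(\partial_\mu\hat\phi_2)}\right)\right]\delta\hat\phi_2. \tag{7.7}$$

Since $\delta(\partial_\mu\hat\phi_i) = \partial_\mu(\delta\hat\phi_i)$, the right-hand side of (7.7) is just a total divergence,
and (7.7) becomes

$$0 = \partial_\mu\left[\frac{\partial\hat{\mathcal L}}{\partial(\partial_\mu\hat\phi_1)}\delta\hat\phi_1 + \frac{\partial\hat{\mathcal L}}{\partial(\partial_\mu\hat\phi_2)}\delta\hat\phi_2\right]. \tag{7.8}$$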
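These formal steps are actually perfectly general, and will apply whenever
a certain Lagrangian depending on two fields $\hat\phi_1$ and $\hat\phi_2$ is invariant under
$\hat\phi_i \to \hat\phi_i + \delta\hat\phi_i$. In the present case, with $\delta\hat\phi_i$ given by (7.5), we have

$$0 = \partial_\mu\left[\frac{\partial\hat{\mathcal L}}{\partial(\partial_\mu\hat\phi_1)}(-\epsilon\hat\phi_2) + \frac{\partial\hat{\mathcal L}}{\partial(\partial_\mu\hat\phi_2)}(\epsilon\hat\phi_1)\right] = \epsilon\,\partial_\mu\left[(\partial^\mu\hat\phi_2)\hat\phi_1 - (\partial^\mu\hat\phi_1)\hat\phi_2\right] \tag{7.9}$$

where the free-field Lagrangian (7.1) has been used in the second step. Since
$\epsilon$ is arbitrary, we have proved that the 4-vector operator

$$\hat N^\mu_\phi = \hat\phi_1\partial^\mu\hat\phi_2 - \hat\phi_2\partial^\mu\hat\phi_1 \tag{7.10}$$

is conserved:

$$\partial_\mu\hat N^\mu_\phi = 0. \tag{7.11}$$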


Such conserved 4-vector operators are called symmetry currents, often denoted 
generically by $\hat J^\mu$. There is a general theorem (due to Noether (1918) in the
classical field case) to the effect that if a Lagrangian is invariant under a 
continuous transformation, then there will be an associated symmetry current. 
We shall consider Noether’s theorem again in volume 2. 
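
As an illustration of (7.10) and (7.11), the conservation of the current can be checked numerically for classical fields. This is a minimal sketch, assuming one space dimension, the metric signature (+, -), classical plane-wave solutions of the KG equation in place of field operators, and central finite differences for the derivatives; the mass, wavenumbers and function names are hypothetical.

```python
import math

M = 1.3                 # hypothetical common mass of the two fields
K1, K2 = 0.7, -1.1      # hypothetical wavenumbers of the two solutions

def omega(k):
    """Free-particle dispersion relation omega^2 = M^2 + k^2."""
    return math.sqrt(M**2 + k**2)

# Two independent classical solutions of (d^2/dt^2 - d^2/dx^2 + M^2) phi = 0
def phi1(t, x):
    return math.cos(omega(K1) * t - K1 * x)

def phi2(t, x):
    return math.sin(omega(K2) * t - K2 * x + 0.4)

def d_dt(f, t, x, h=1e-4):
    """Central finite-difference time derivative."""
    return (f(t + h, x) - f(t - h, x)) / (2 * h)

def d_dx(f, t, x, h=1e-4):
    """Central finite-difference space derivative."""
    return (f(t, x + h) - f(t, x - h)) / (2 * h)

def N0(t, x):
    # N^0 = phi1 d^0 phi2 - phi2 d^0 phi1, with d^0 = d/dt
    return phi1(t, x) * d_dt(phi2, t, x) - phi2(t, x) * d_dt(phi1, t, x)

def N1(t, x):
    # N^1 = phi1 d^1 phi2 - phi2 d^1 phi1, with d^1 = -d/dx (index raised)
    return -(phi1(t, x) * d_dx(phi2, t, x) - phi2(t, x) * d_dx(phi1, t, x))

def divergence(t, x):
    """d_mu N^mu = dN^0/dt + dN^1/dx."""
    return d_dt(N0, t, x) + d_dx(N1, t, x)

print(abs(divergence(0.3, -0.8)) < 1e-5)
```

Analytically the divergence reduces to $\phi_1\Box\phi_2 - \phi_2\Box\phi_1$, which vanishes only because both fields satisfy the KG equation with the same mass $M$; the numerical value is zero up to finite-difference error.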
What does all this have to do with symmetry operators? Written out in
full, (7.11) is

$$\partial\hat N^0_\phi/\partial t + \boldsymbol\nabla\cdot\hat{\boldsymbol N}_\phi = 0. \tag{7.12}$$

Integrating this equation over all space, we obtain

$$\frac{d}{dt}\int_V \hat N^0_\phi\, d^3x + \int_S \hat{\boldsymbol N}_\phi\cdot d\boldsymbol S = 0 \tag{7.13}$$

where we have used the divergence theorem in the second term. Normally the
fields may be assumed to die off sufficiently fast at infinity that the surface
integral vanishes (by using wave packets, for example), and we can therefore
deduce that the quantity $\hat N_\phi$ is constant in time, where

$$\hat N_\phi = \int \hat N^0_\phi\, d^3x; \tag{7.14}$$

that is, the volume integral of the $\mu = 0$ component of a symmetry current is
a symmetry operator.

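In order to see how $\hat N_\phi$ serves to distinguish ‘particle’ from ‘antiparticle’
in the simple example we are considering, it turns out to be convenient to
regard $\hat\phi_1$ and $\hat\phi_2$ as components of a single complex field

$$\hat\phi = \frac{1}{\sqrt 2}(\hat\phi_1 - i\hat\phi_2)$$
$$\hat\phi^\dagger = \frac{1}{\sqrt 2}(\hat\phi_1 + i\hat\phi_2). \tag{7.15}$$

The plane-wave expansions of the form (5.155) for $\hat\phi_1$ and $\hat\phi_2$ imply that $\hat\phi$ has
the expansion

$$\hat\phi = \int \frac{d^3k}{(2\pi)^3\sqrt{2\omega}}\left[\hat a(k)e^{-ik\cdot x} + \hat b^\dagger(k)e^{ik\cdot x}\right] \tag{7.16}$$

where

$$\hat a(k) = \frac{1}{\sqrt 2}(\hat a_1 - i\hat a_2)$$
$$\hat b^\dagger(k) = \frac{1}{\sqrt 2}(\hat a_1^\dagger - i\hat a_2^\dagger) \tag{7.17}$$

and $\omega = (M^2 + \boldsymbol k^2)^{1/2}$. The operators $\hat a$, $\hat a^\dagger$, $\hat b$, $\hat b^\dagger$ obey the commutation
relations

$$[\hat a(k), \hat a^\dagger(k')] = (2\pi)^3\delta^3(\boldsymbol k - \boldsymbol k')$$
$$[\hat b(k), \hat b^\dagger(k')] = (2\pi)^3\delta^3(\boldsymbol k - \boldsymbol k') \tag{7.18}$$

with all others vanishing; this follows from the commutation relations

$$[\hat a_i(k), \hat a_j^\dagger(k')] = \delta_{ij}(2\pi)^3\delta^3(\boldsymbol k - \boldsymbol k') \quad \text{etc.} \tag{7.19}$$

for the $\hat a_i$ operators. Note that two distinct mode operators, $\hat a$ and $\hat b$, are
appearing in the expansion (7.16) of the complex field.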
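In terms of this complex $\hat\phi$ the Lagrangian of (7.1) becomes

$$\hat{\mathcal L} = \partial_\mu\hat\phi^\dagger\partial^\mu\hat\phi - M^2\hat\phi^\dagger\hat\phi \tag{7.20}$$

and the Hamiltonian is (dropping the zero-point energy, i.e. normally ordering)

$$\hat H = \int \frac{d^3k}{(2\pi)^3}\left[\hat a^\dagger(k)\hat a(k) + \hat b^\dagger(k)\hat b(k)\right]\omega. \tag{7.21}$$

The O(2) transformation (7.2) becomes a simple phase change

$$\hat\phi' = e^{-i\alpha}\hat\phi \tag{7.22}$$

which (see comment (iii) of section 2.6) is called a global U(1) phase
transformation; plainly the Lagrangian (7.20) is invariant under (7.22). The associated
symmetry current $\hat N^\mu_\phi$ becomes

$$\hat N^\mu_\phi = i(\hat\phi^\dagger\partial^\mu\hat\phi - \hat\phi\,\partial^\mu\hat\phi^\dagger) \tag{7.23}$$

and the symmetry operator $\hat N_\phi$ is (see problem 7.2)

$$\hat N_\phi = \int \frac{d^3k}{(2\pi)^3}\left[\hat a^\dagger(k)\hat a(k) - \hat b^\dagger(k)\hat b(k)\right]. \tag{7.24}$$

Note that $\hat N_\phi$ has been normally ordered in anticipation of our later vacuum
definition (7.30), so that $\hat N_\phi|0\rangle = 0$.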
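
The statement that the O(2) rotation (7.2) is the same thing as the phase change (7.22) can be illustrated with ordinary numbers. This is a small sketch, assuming classical field values at a single point; the sample values and function names are hypothetical.

```python
import cmath
import math

def rotate(phi1, phi2, alpha):
    """The O(2) rotation (7.2) acting on the two real components."""
    c, s = math.cos(alpha), math.sin(alpha)
    return c * phi1 - s * phi2, s * phi1 + c * phi2

def complex_field(phi1, phi2):
    """The combination (7.15): phi = (phi1 - i phi2) / sqrt(2)."""
    return (phi1 - 1j * phi2) / math.sqrt(2)

# Arbitrary sample values (hypothetical)
phi1, phi2, alpha = 0.8, -1.7, 0.73

# Rotate the real components first, then form the complex field ...
rotated = complex_field(*rotate(phi1, phi2, alpha))
# ... and compare with multiplying the original complex field by a phase.
phased = cmath.exp(-1j * alpha) * complex_field(phi1, phi2)

print(abs(rotated - phased) < 1e-12)
```

With the opposite sign choice in (7.15), $\phi = (\phi_1 + i\phi_2)/\sqrt 2$, the same rotation would appear as multiplication by $e^{+i\alpha}$ instead.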

We now observe that the Hamiltonian (7.21) involves the sum of the number
operators for ‘a’ quanta and ‘b’ quanta, whereas $\hat N_\phi$ involves the difference
of these number operators. Put differently, $\hat N_\phi$ counts +1 for each particle of
type ‘a’ and −1 for each of type ‘b’. This strongly suggests the interpretation
that the b’s are the antiparticles of the a’s: $\hat N_\phi$ is the conserved symmetry
operator whose eigenvalues serve to distinguish them. For a general state, the
eigenvalue of $\hat N_\phi$ is the number of a’s minus the number of anti-a’s and it is
a constant of the motion, as is the total energy, which is the sum of the a
energies and anti-a energies.

We have here the simplest form of the particle-antiparticle distinction:
only one additive conserved quantity is involved. A more complicated example
would be the ($K^+$, $K^-$) pair, which have opposite values of strangeness and of
electric charge. Of course, in our simple Lagrangian (7.20) the electromagnetic
interaction is absent, and so no electric charge can be defined (we shall remedy
this later); the complex field $\hat\phi$ would be suitable (in respect of strangeness)
for describing the ($K^0$, $\bar K^0$) pair.

The symmetry operator $\hat N_\phi$ has a number of further important properties.
First of all, we have shown that $d\hat N_\phi/dt = 0$ from the general (Noether)
argument, but we ought also to check that

$$[\hat N_\phi, \hat H] = 0 \tag{7.25}$$

as is required for consistency, and expected for a symmetry operator. This is
indeed true (see problem 7.2(a)). We can also show

$$[\hat N_\phi, \hat\phi] = -\hat\phi$$
$$[\hat N_\phi, \hat\phi^\dagger] = \hat\phi^\dagger \tag{7.26}$$

and, by expansion of the exponential (problem 7.2(b)), that

$$\hat U(\alpha)\hat\phi\,\hat U^{-1}(\alpha) = e^{-i\alpha}\hat\phi = \hat\phi' \tag{7.27}$$

with

$$\hat U(\alpha) = e^{i\alpha\hat N_\phi}. \tag{7.28}$$

This shows that the unitary operator $\hat U(\alpha)$ effects finite U(1) rotations.
Consider now a state $|N_\phi\rangle$ which is an eigenstate of $\hat N_\phi$ with eigenvalue
$N_\phi$. What is the eigenvalue of $\hat N_\phi$ for the state $\hat\phi|N_\phi\rangle$? It is easy to show,
using (7.26), that

$$\hat N_\phi\hat\phi|N_\phi\rangle = (N_\phi - 1)\hat\phi|N_\phi\rangle \tag{7.29}$$

so the application of $\hat\phi$ to a state lowers its $\hat N_\phi$ eigenvalue by 1. This is
consistent with our interpretation that the $\hat\phi$ field destroys particles ‘a’ via
the $\hat a$ piece in (7.16). (This ‘$\hat\phi$ destroys particles’ convention is the reason for
choosing $\hat\phi = (\hat\phi_1 - i\hat\phi_2)/\sqrt 2$ in (7.15), which in turn led to the minus sign in
the relation (7.26) and to the earlier eigenvalue $N_\phi - 1$.) That $\hat\phi$ lowers the
$\hat N_\phi$ eigenvalue by 1 is also consistent with the interpretation that the same
field $\hat\phi$ creates an antiparticle via the $\hat b^\dagger$ piece in (7.16). In the same way, by
considering $\hat\phi^\dagger|N_\phi\rangle$, one easily verifies that $\hat\phi^\dagger$ increases $N_\phi$ by 1, by creating
a particle via $\hat a^\dagger$ or destroying an antiparticle via $\hat b$. The vacuum state (no
particles and no antiparticles present) is defined by

$$\hat a(k)|0\rangle = \hat b(k)|0\rangle = 0 \quad \text{for all } k. \tag{7.30}$$


As anticipated, therefore, the complex field $\hat\phi$ contains two distinct kinds
of mode operator, one having to do with particles (with positive $N_\phi$), the
other with antiparticles (negative $N_\phi$). Which we choose to call ‘particle’ and
which ‘antiparticle’ is of course purely a matter of convention: after all, the
negatively charged electron is always regarded as the ‘particle’, while in the
case of the pions we call the positively charged $\pi^+$ the particle.
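
The bookkeeping behind (7.29) can be made concrete in a toy model. The sketch below is an illustration under simplifying assumptions, not the continuum theory: it keeps a single momentum mode, labels basis states by the occupation numbers $(n_a, n_b)$, and uses unit normalization in place of the $(2\pi)^3\delta^3$ of (7.18). All function names are hypothetical.

```python
import math

# Each operator maps a basis state (n_a, n_b) to (coefficient, new_state),
# or to None when it annihilates the state.

def a(state):
    n_a, n_b = state
    return None if n_a == 0 else (math.sqrt(n_a), (n_a - 1, n_b))

def a_dag(state):
    n_a, n_b = state
    return (math.sqrt(n_a + 1), (n_a + 1, n_b))

def b(state):
    n_a, n_b = state
    return None if n_b == 0 else (math.sqrt(n_b), (n_a, n_b - 1))

def b_dag(state):
    n_a, n_b = state
    return (math.sqrt(n_b + 1), (n_a, n_b + 1))

def number_difference(state):
    """Eigenvalue of N_phi: number of a quanta minus number of b quanta."""
    return state[0] - state[1]

def energy(state, omega=1.0):
    """Eigenvalue of H for one mode: (n_a + n_b) * omega."""
    return (state[0] + state[1]) * omega

def charge_shift(op, state):
    """Change in the N_phi eigenvalue produced by op, or None if op kills the state."""
    result = op(state)
    if result is None:
        return None
    _, new_state = result
    return number_difference(new_state) - number_difference(state)

start = (2, 1)
# phi contains a and b_dagger; phi_dagger contains a_dagger and b.
shifts_phi = [charge_shift(op, start) for op in (a, b_dag)]
shifts_phi_dag = [charge_shift(op, start) for op in (a_dag, b)]
print(shifts_phi, shifts_phi_dag)
```

Both pieces of $\hat\phi$ lower the eigenvalue by one unit and both pieces of $\hat\phi^\dagger$ raise it by one, while `a` and `b` both return `None` on the state `(0, 0)`, which plays the role of the vacuum (7.30).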
FIGURE 7.1
(a) For $t_1 > t_2$, a $\phi$ particle ($N_\phi = 1$) propagates from $x_2$ to $x_1$; (b) for $t_2 > t_1$
an anti-$\phi$ particle ($N_\phi = -1$) propagates from $x_1$ to $x_2$.


Feynman rules for theories involving complex scalar fields may be derived
by a straightforward extension of the procedure explained in chapter 6. It
is, however, worth pausing over the propagator. The only non-vanishing vev
of the time-ordered product of two $\hat\phi$ fields is $\langle 0|T(\hat\phi(x_1)\hat\phi^\dagger(x_2))|0\rangle$ (the vev’s
of $T(\hat\phi\hat\phi)$ and $T(\hat\phi^\dagger\hat\phi^\dagger)$ vanish with the vacuum defined as in (7.30)). In
section 6.3.2 we gave a pictorial interpretation of the propagator for a real scalar
field; let us now consider the analogous pictures for the complex field. For
$t_1 > t_2$ the time-ordered product is $\hat\phi(x_1)\hat\phi^\dagger(x_2)$; using the expansion (7.16)
and the vacuum conditions (7.30), the only surviving term in the vev is that
in which an ‘$\hat a^\dagger$’ creates a particle ($N_\phi = 1$) at $(\boldsymbol x_2, t_2)$ and an ‘$\hat a$’ destroys it
at $(\boldsymbol x_1, t_1)$; the ‘$\hat b$’ operators in $\hat\phi(x_2)^\dagger$ give zero when acting on $|0\rangle$, as do the
‘$\hat b^\dagger$’ operators in $\hat\phi(x_1)$ when acting on $\langle 0|$. Thus for $t_1 > t_2$ we have the
pictorial interpretation of figure 7.1(a). For $t_2 > t_1$, however, the time-ordered
product is $\hat\phi^\dagger(x_2)\hat\phi(x_1)$. Here the surviving vev comes from the ‘$\hat b^\dagger$’ in $\hat\phi(x_1)$
creating an antiparticle ($N_\phi = -1$) at $x_1$, which is then annihilated by the
‘$\hat b$’ in $\hat\phi^\dagger(x_2)$. This $t_2 > t_1$ process is shown in figure 7.1(b). The inclusion of
both processes shown in figure 7.1 makes sense physically, following
considerations similar to those put forward ‘intuitively’ in section 3.5.4: the process
of figure 7.1(a) creates (say) a positive unit of $N_\phi$ at $x_2$ and loses a positive
unit at $x_1$, while another way of effecting the same ‘$N_\phi$ transfer’ is to create
an antiparticle of unit negative $N_\phi$ at $x_1$, and propagate it to $x_2$ where it
is destroyed, as in figure 7.1(b). It is important to be absolutely clear that
the Feynman propagator $\langle 0|T(\hat\phi(x_1)\hat\phi^\dagger(x_2))|0\rangle$ includes both the processes in
figures 7.1(a) and (b).

In practice, as we found in section 6.3.2, we want the momentum-space
version of the propagator, i.e. its Fourier transform. As we also noted there
(cf also appendix G), the propagator is a Green function for the KG operator
$(\Box + m^2)$ with mass parameter $m$; in momentum-space this is just the inverse,
$(-k^2 + m^2)^{-1}$. In the present case, since both $\hat\phi$ and $\hat\phi^\dagger$ obey the same
KG equation, with mass parameter $M$, we expect that the momentum-space
version of $\langle 0|T(\hat\phi(x_1)\hat\phi^\dagger(x_2))|0\rangle$ is also

$$\frac{i}{k^2 - M^2 + i\epsilon}. \tag{7.31}$$

This can be verified by inserting the expansion (7.16) into the vev of the
T-product, and following the steps used in section 6.3.2 for the scalar case.

FIGURE 7.2
Equivalent Feynman graphs for single W-exchange in $\nu_e + e^- \to \nu_e + e^-$.

In this (momentum-space) version, it is the ‘$i\epsilon$’ which keeps track of the
‘particles going from 2 to 1 if $t_1 > t_2$’ and ‘antiparticles going from 1 to 2 if
$t_2 > t_1$’ (recall its appearance in the representation (6.93) of the all-important
$\theta$-function). As in the scalar case, momentum-space propagators in Feynman
diagrams carry no implied order of emission/absorption process; both the
processes in figure 7.1 are always included in all propagators. Arrows showing
‘momentum flow’ now also show the flow of all conserved quantum numbers.
Thus the process shown in figure 7.2(a) can equally well be represented as in
figure 7.2(b).

There is one more bit of physics to be gleaned from $\langle 0|T(\hat\phi(x_1)\hat\phi^\dagger(x_2))|0\rangle$.
As in the real scalar field case, the vanishing of the commutator at space-like
separations

$$[\hat\phi(x_1), \hat\phi^\dagger(x_2)] = 0 \quad \text{for } (x_1 - x_2)^2 < 0 \tag{7.32}$$

guarantees the Lorentz invariance of the propagator for the complex scalar
field and of the S-matrix. But in this (complex) case there is a further twist
to the story. Evaluation of $[\hat\phi(x_1), \hat\phi^\dagger(x_2)]$ reveals (problem 7.3) that, in the
region $(x_1 - x_2)^2 < 0$, the commutator is the difference of two functions (not
field operators), one of which arises from the propagation of a particle from $x_2$
to $x_1$, the other of which comes from the propagation of an antiparticle from
$x_1$ to $x_2$ (just as in figure 7.1). Both processes must exist for this difference
to be zero, and furthermore for cancellations between them to occur in the
space-like region the masses of the particle and antiparticle must be identical.
In quantum field theory, therefore, ‘causality’ (in the sense of condition
(7.32); cf (6.82)) requires that every particle has to have a corresponding
antiparticle, with the same mass and opposite quantum numbers. As we saw
in chapter 4, these requirements are guaranteed by the CPT theorem, which
is a consequence of very general principles of quantum field theory.


I remember that when someone had tried to teach me about creation and
annihilation operators, that this operator creates an electron, I said ‘how 
do you create an electron? It disagrees with the conservation of charge,’ 
and in that way I blocked my mind from learning a very practical scheme 
of calculation. 


—From the lecture delivered by Richard Feynman in Stockholm, Sweden, 
on 11 December 1965, when he received the Nobel Prize in physics, which 
he shared with Sin-itiro Tomonaga and Julian Schwinger. (Feynman 1966). 


We now turn to the problem of setting up a quantum field which, in its
wave aspects, satisfies the Dirac equation (cf comment (5) in section 5.2.5),
and in its ‘particle’ aspects creates or annihilates fermions and antifermions.
Following the ‘Heisenberg-Lagrange-Hamilton’ approach of section 5.2.5, we
begin by writing down the Lagrangian which, via the corresponding
Euler-Lagrange equation, produces the Dirac equation as the ‘field equation’. The
answer (see problem 7.4) is

$$\hat{\mathcal L}_D = i\hat\psi^\dagger\dot{\hat\psi} + i\hat\psi^\dagger\boldsymbol\alpha\cdot\boldsymbol\nabla\hat\psi - m\hat\psi^\dagger\beta\hat\psi. \tag{7.33}$$

The relativistic invariance of this is more evident in $\gamma$-matrix notation
(problem 4.3):

$$\hat{\mathcal L}_D = \bar{\hat\psi}(i\gamma^\mu\partial_\mu - m)\hat\psi. \tag{7.34}$$

We can now attempt to ‘quantize’ the field $\hat\psi$ by making a mode expansion
in terms of plane-wave solutions of the Dirac equation, in a fashion similar to
that for the complex scalar field in (7.16). We obtain (see problem 3.8 for the
definition of the spinors $u$ and $v$, and the attendant normalization choice)

$$\hat\psi = \int \frac{d^3k}{(2\pi)^3\sqrt{2\omega}}\sum_{s=1,2}\left[\hat c_s(k)u(k,s)e^{-ik\cdot x} + \hat d_s^\dagger(k)v(k,s)e^{ik\cdot x}\right], \tag{7.35}$$
where $\omega = (m^2 + \boldsymbol k^2)^{1/2}$. We wish to interpret $\hat c_s^\dagger(k)$ as the creation operator
for a Dirac particle of spin $s$ and momentum $k$. By analogy with (7.16), we
expect that $\hat d_s^\dagger(k)$ creates the corresponding antiparticle. Presumably we must
define the vacuum by (cf (7.30))

$$\hat c_s(k)|0\rangle = \hat d_s(k)|0\rangle = 0 \quad \text{for all } k \text{ and } s = 1, 2. \tag{7.36}$$

A two-fermion state is then

$$|k_1, s_1; k_2, s_2\rangle \propto \hat c_{s_1}^\dagger(k_1)\hat c_{s_2}^\dagger(k_2)|0\rangle. \tag{7.37}$$

But it is here that there must be a difference from the boson case. We require
a state containing two identical fermions to be antisymmetric under the
exchange of state labels $k_1 \leftrightarrow k_2$, $s_1 \leftrightarrow s_2$, and thus to be forbidden if the two
sets of quantum numbers are the same, in accordance with the Pauli exclusion
principle, responsible for so many well-established features of the structure of
matter.

The solution to this dilemma is simple but radical: for fermions, commuta- 
tion relations are replaced by anticommutation relations! The anticommutator 
of two operators A and B is written: 


{A,B} = AB + BA. (7.38) 
If two different $\hat c^\dagger$’s anticommute, then

$$\hat c_{s_1}^\dagger(k_1)\hat c_{s_2}^\dagger(k_2) + \hat c_{s_2}^\dagger(k_2)\hat c_{s_1}^\dagger(k_1) = 0 \tag{7.39}$$

so that we have the desired antisymmetry

$$|k_1, s_1; k_2, s_2\rangle = -|k_2, s_2; k_1, s_1\rangle. \tag{7.40}$$

In general we postulate

$$\{\hat c_{s_1}(k_1), \hat c_{s_2}^\dagger(k_2)\} = (2\pi)^3\delta^3(\boldsymbol k_1 - \boldsymbol k_2)\delta_{s_1 s_2}$$
$$\{\hat c_{s_1}(k_1), \hat c_{s_2}(k_2)\} = \{\hat c_{s_1}^\dagger(k_1), \hat c_{s_2}^\dagger(k_2)\} = 0 \tag{7.41}$$

and similarly for the $\hat d$’s and $\hat d^\dagger$’s. The factor in front of the $\delta$-function depends
on the convention for normalizing Dirac wavefunctions.

We must at once emphasize that in taking this ‘replace commutators by
anticommutators’ step we now depart decisively from the intuitive,
quasi-mechanical, picture of a quantum field given in chapter 5, namely as a system
of quantized harmonic oscillators. Of course, the field expansion (7.35) is
a linear superposition of ‘modes’ (plane-wave solutions), as for the complex
scalar field in (7.16) for example; but the ‘mode operators’ $\hat c_s$ and $\hat d_s^\dagger$ are
fermionic (obeying anticommutation relations) not bosonic (obeying
commutation relations). As mentioned at the end of section 5.1, it does not seem
possible to provide any mechanical model of a system (in three dimensions)
whose normal vibrations are fermionic. Correspondingly, there is no
concept of a ‘classical electron field’, analogous to the classical electromagnetic
field (which doubtless explains why we tend to think of fermions as basically
‘more particle-like’). However, we can certainly recover a quantum
mechanical wavefunction from (7.35) by considering, as in comment (5) of section 5.4,
the vacuum-to-one-particle matrix element $\langle 0|\hat\psi(\boldsymbol x, t)|k_1, s_1\rangle$.
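
The anticommutation relations (7.39) to (7.41) can be realized explicitly with small matrices, which makes the antisymmetry (7.40) and the Pauli principle easy to check. The sketch below is an illustration under simplifying assumptions that are not taken from the text: it uses only two discrete modes, a Jordan-Wigner-type matrix representation, and a Kronecker delta with unit normalization in place of $(2\pi)^3\delta^3$. All names are hypothetical.

```python
def kron(A, B):
    """Kronecker (tensor) product of two matrices given as nested lists."""
    return [[A[i][j] * B[k][l] for j in range(len(A[0])) for l in range(len(B[0]))]
            for i in range(len(A)) for k in range(len(B))]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def transpose(A):
    return [list(row) for row in zip(*A)]

def add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def anticommutator(A, B):
    return add(matmul(A, B), matmul(B, A))

def apply(A, v):
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]

I2 = [[1, 0], [0, 1]]
Z = [[1, 0], [0, -1]]          # sign factor that enforces anticommutation between modes
LOWER = [[0, 1], [0, 0]]       # takes the occupied state (0, 1) to the empty state (1, 0)

# Annihilation operators for modes 1 and 2 (real matrices, so dagger = transpose)
c1 = kron(LOWER, I2)
c2 = kron(Z, LOWER)
c1_dag, c2_dag = transpose(c1), transpose(c2)

IDENTITY = kron(I2, I2)
ZERO = [[0] * 4 for _ in range(4)]
vacuum = [1, 0, 0, 0]          # both modes empty

# The two orderings of the creation operators give opposite signs, as in (7.40)
state_12 = apply(c1_dag, apply(c2_dag, vacuum))
state_21 = apply(c2_dag, apply(c1_dag, vacuum))
print(state_12, state_21)
print(anticommutator(c1, c1_dag) == IDENTITY, matmul(c1_dag, c1_dag) == ZERO)
```

The vanishing of `matmul(c1_dag, c1_dag)` is the exclusion principle in this toy setting: applying the same creation operator twice gives zero, so no state can hold two fermions with identical labels.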

In the bosonic case, we arrived at the commutation relations (5.130) for the
mode operators by postulating the ‘fundamental commutator of quantum field
theory’, equation (5.117), which was an extension to fields of the canonical
commutation relations of quantum (particle) mechanics. For fermions, we
have simply introduced the anticommutation relations (7.41) ‘by hand’, so
as to satisfy the Pauli principle. We may ask: What then becomes of the
analogous ‘fundamental commutator’ in the fermionic case? A plausible guess
is that, as with the mode operators, the ‘fundamental commutator’ is to be
replaced by a ‘fundamental anticommutator’, between the fermionic field $\hat\psi$
and its ‘canonically conjugate momentum field’ $\hat\pi_D$, of the form:

$$\{\hat\psi(\boldsymbol x, t), \hat\pi_D(\boldsymbol y, t)\} = i\delta^3(\boldsymbol x - \boldsymbol y). \tag{7.42}$$

As far as $\hat\pi_D$ is concerned, we may suppose that its definition is formally
analogous to (5.122), which would yield

$$\hat\pi_D = \frac{\partial\hat{\mathcal L}_D}{\partial\dot{\hat\psi}} = i\hat\psi^\dagger. \tag{7.43}$$

We must also not forget that both $\hat\psi$ and $\hat\pi_D$ are four-component objects,
carrying spinor indices. Thus we are led to expect the result

$$\{\hat\psi_\alpha(\boldsymbol x, t), \hat\psi_\beta^\dagger(\boldsymbol y, t)\} = \delta^3(\boldsymbol x - \boldsymbol y)\delta_{\alpha\beta}, \tag{7.44}$$

where $\alpha$ and $\beta$ are spinor indices. It is a good exercise to check, using (7.41),
that this is indeed the case (problem 7.5). We also find

$$\{\hat\psi(\boldsymbol x, t), \hat\psi(\boldsymbol y, t)\} = \{\hat\psi^\dagger(\boldsymbol x, t), \hat\psi^\dagger(\boldsymbol y, t)\} = 0. \tag{7.45}$$

In this (anticommutator) sense, then, we have a ‘canonical’ formalism for
fermions.
The Dirac Hamiltonian density is then (cf (5.123))

$$\hat{\mathcal H}_D = \hat\pi_D\dot{\hat\psi} - \hat{\mathcal L}_D = \hat\psi^\dagger\boldsymbol\alpha\cdot(-i\boldsymbol\nabla)\hat\psi + m\hat\psi^\dagger\beta\hat\psi \tag{7.46}$$

using (7.43) and (7.33), and the Hamiltonian is

$$\hat H_D = \int\left[\hat\psi^\dagger\boldsymbol\alpha\cdot(-i\boldsymbol\nabla)\hat\psi + m\hat\psi^\dagger\beta\hat\psi\right]d^3x. \tag{7.47}$$
One may well wonder why things have to be this way: ‘bosons commute,
fermions anticommute’. To gain further insight, we turn again to a consider- 
ation of symmetries and the question of particle and antiparticle — this time 
for the Dirac field, rather than the Dirac wavefunction discussed in chapter 4. 

The Dirac field w is a complex field, as is reflected in the two distinct mode 
operators in the expansion (7.35); as in the complex scalar field case, there 
is only one mass parameter and we expect the quanta to be interpretable as 
particle and antiparticle. The symmetry operator which distinguishes them is 
found by analogy with the complex scalar field case. We note that $\hat{\mathcal L}_D$ (the
quantized version of (7.34)) is invariant under the global U(1) transformation

$$\hat\psi \to \hat\psi' = e^{-i\alpha}\hat\psi \tag{7.48}$$

which is

$$\hat\psi \to \hat\psi' = \hat\psi - i\epsilon\hat\psi \tag{7.49}$$

in infinitesimal form. The corresponding (Noether) symmetry current can be
calculated as

$$\hat N^\mu_\psi = \bar{\hat\psi}\gamma^\mu\hat\psi \tag{7.50}$$

and the associated symmetry operator is

$$\hat N_\psi = \int\hat\psi^\dagger\hat\psi\, d^3x. \tag{7.51}$$

$\hat N_\psi$ is clearly a number operator for the fermion case. As for the complex
scalar field, invariance under a global U(1) phase transformation is associated 
with a number conservation law. 

Inserting the plane-wave expansion (7.35), we obtain, after some effort
(problem 7.6),

$$\hat N_\psi = \int\frac{d^3k}{(2\pi)^3}\sum_{s=1,2}\left[\hat c_s^\dagger(k)\hat c_s(k) + \hat d_s(k)\hat d_s^\dagger(k)\right]. \tag{7.52}$$

Similarly the Dirac Hamiltonian may be shown to have the form (problem 7.6)

$$\hat H_D = \int\frac{d^3k}{(2\pi)^3}\sum_{s=1,2}\left[\hat c_s^\dagger(k)\hat c_s(k) - \hat d_s(k)\hat d_s^\dagger(k)\right]\omega. \tag{7.53}$$

It is important to state that in obtaining (7.52) and (7.53), we have not
assumed either commutation or anticommutation relations for the mode
operators $\hat c$, $\hat c^\dagger$, $\hat d$ and $\hat d^\dagger$, only properties of the Dirac spinors; in particular, neither
(7.52) nor (7.53) has been normally ordered. Suppose now that we assume
commutation relations, so as to rewrite the last terms in (7.52) and (7.53) in
normally ordered form as $\hat d_s^\dagger(k)\hat d_s(k)$. We see that $\hat H_D$ will then contain the
difference of two number operators for ‘c’ and ‘d’ particles, and is therefore
not positive-definite as we require for a sensible theory. Moreover, we suspect
that, as in the $\hat\phi$ case, the ‘d’s’ ought to be the antiparticles of the ‘c’s’,
carrying opposite $\hat N_\psi$ value: but $\hat N_\psi$ is then (with the previous assumption about
commutation relations) just proportional to the sum of ‘c’ and ‘d’ number
operators, counting +1 for each type, which does not fit this interpretation.
However, if anticommutation relations are assumed, both these problems
disappear: dropping the usual infinite terms, we obtain the normally ordered
forms

$$\hat N_\psi = \int\frac{d^3k}{(2\pi)^3}\sum_{s=1,2}\left[\hat c_s^\dagger(k)\hat c_s(k) - \hat d_s^\dagger(k)\hat d_s(k)\right] \tag{7.54}$$

$$\hat H_D = \int\frac{d^3k}{(2\pi)^3}\sum_{s=1,2}\left[\hat c_s^\dagger(k)\hat c_s(k) + \hat d_s^\dagger(k)\hat d_s(k)\right]\omega \tag{7.55}$$

which are satisfactory, and allow us to interpret the ‘d’ quanta as the
antiparticles of the ‘c’ quanta. Similar difficulties would have occurred in the complex
scalar field case if we had assumed anticommutation relations for the boson
operators, and the ‘causality’ discussion at the end of the preceding section
would not have worked either (instead of a difference of terms we would have
had a sum). It is in this way that quantum field theory enforces the connection
between spin and statistics.

Our discussion here is only a part of a more general approach leading to 
the same conclusion, first given by Pauli (1940); see also Streater et al. (1964). 

As in the complex scalar case, the other crucial ingredient we need is the
Dirac propagator $\langle 0|T(\hat\psi(x_1)\bar{\hat\psi}(x_2))|0\rangle$. We shall see in section 7.4 why it is $\bar{\hat\psi}$
here rather than $\hat\psi^\dagger$; the reason is essentially to do with Lorentz covariance
(see section 4.1.2). Because the $\hat\psi$ fields are anticommuting, the T-symbol
now has to be understood as

$$T(\hat\psi(x_1)\bar{\hat\psi}(x_2)) = \hat\psi(x_1)\bar{\hat\psi}(x_2) \quad \text{for } t_1 > t_2 \tag{7.56}$$
$$\phantom{T(\hat\psi(x_1)\bar{\hat\psi}(x_2))} = -\bar{\hat\psi}(x_2)\hat\psi(x_1) \quad \text{for } t_2 > t_1. \tag{7.57}$$

Once again, this propagator is proportional to a Green function, this time
for the Dirac equation, of course. Using $\gamma$-matrix notation (problem 4.3) the
Dirac equation is (cf (7.34))

$$(i\gamma^\mu\partial_\mu - m)\hat\psi = 0. \tag{7.58}$$

The momentum-space version of the propagator is proportional to the inverse
of the operator in (7.58), when written in k-space, namely to $(\not{k} - m)^{-1}$ where

$$\not{k} = \gamma^\mu k_\mu \tag{7.59}$$

is an important shorthand notation (pronounced ‘k-slash’). In fact, the
Feynman propagator for Dirac fields is

$$\frac{i}{\not{k} - m + i\epsilon}. \tag{7.60}$$
As in (7.31), the $i\epsilon$ takes care of the particle/antiparticle, emission/absorption
business. Formula (7.60) is the fermion analogue of ‘rule (ii)’ in (6.103).

The reader should note carefully one very important difference between
(7.60) and (7.31), which is that (7.60) is a 4×4 matrix. What we are
really saying (cf (6.98)) is that the Fourier transform of $\langle 0|T(\hat\psi_\alpha(x_1)\bar{\hat\psi}_\beta(x_2))|0\rangle$,
where $\alpha$ and $\beta$ run over the four components of the Dirac field, is equal to the
$(\alpha, \beta)$ matrix element of the matrix $i(\not{k} - m + i\epsilon)^{-1}$:

$$\int d^4(x_1 - x_2)\, e^{ik\cdot(x_1 - x_2)}\langle 0|T(\hat\psi_\alpha(x_1)\bar{\hat\psi}_\beta(x_2))|0\rangle = i(\not{k} - m + i\epsilon)^{-1}_{\alpha\beta}. \tag{7.61}$$

The form (7.61) can be made to look more like (7.31) by making use of the
result (problem 7.7)

$$(\not{k} - m)(\not{k} + m) = (k^2 - m^2) \tag{7.62}$$

(where the 4×4 unit matrix is understood on the right-hand side) so as to
write (7.61) as

$$\frac{i(\not{k} + m)}{k^2 - m^2 + i\epsilon}. \tag{7.63}$$

As in the scalar case, (7.61) can be directly verified by inserting the field
expansion (7.35) into the left-hand side, and following steps analogous to those
in equations (6.92)-(6.98). In following this through one will meet the
expressions $\sum_s u(k,s)\bar u(k,s)$ and $\sum_s v(k,s)\bar v(k,s)$, which are also 4×4 matrices.
Problem 7.8 shows that these quantities are given by

$$\sum_s u_\alpha(k,s)\bar u_\beta(k,s) = (\not{k} + m)_{\alpha\beta}, \qquad \sum_s v_\alpha(k,s)\bar v_\beta(k,s) = (\not{k} - m)_{\alpha\beta}. \tag{7.64}$$

With these results, and remembering the minus sign in (7.57), one can check
(7.63) (problem 7.9).
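
The matrix identity (7.62), which converts (7.60) into (7.63), can be confirmed numerically with explicit $\gamma$-matrices. This is a minimal sketch, assuming the Dirac (standard) representation of the $\gamma$-matrices and the metric signature (+, -, -, -); the sample momentum, mass and helper names are hypothetical.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(A, c):
    return [[c * x for x in row] for row in A]

def neg(A):
    return scale(A, -1)

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def block(a, b, c, d):
    """Assemble a 4x4 matrix from four 2x2 blocks [[a, b], [c, d]]."""
    top = [a[i] + b[i] for i in range(2)]
    bottom = [c[i] + d[i] for i in range(2)]
    return top + bottom

I2 = identity(2)
Z2 = [[0, 0], [0, 0]]
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

# Dirac representation (an assumed convention): gamma^0 = diag(1, 1, -1, -1),
# gamma^i = [[0, sigma_i], [-sigma_i, 0]]
GAMMA0 = block(I2, Z2, Z2, neg(I2))
GAMMAS = [block(Z2, s, neg(s), Z2) for s in PAULI]

def slash(k):
    """k-slash = gamma^mu k_mu = gamma^0 k^0 - gamma . k for contravariant k."""
    result = scale(GAMMA0, k[0])
    for g, ki in zip(GAMMAS, k[1:]):
        result = add(result, scale(g, -ki))
    return result

def max_deviation(k, m):
    """Largest entry of (k-slash - m)(k-slash + m) - (k^2 - m^2) * unit matrix."""
    ks = slash(k)
    product = matmul(add(ks, scale(identity(4), -m)), add(ks, scale(identity(4), m)))
    k_squared = k[0]**2 - sum(ki**2 for ki in k[1:])
    expected = scale(identity(4), k_squared - m**2)
    return max(abs(product[i][j] - expected[i][j]) for i in range(4) for j in range(4))

print(max_deviation((2.0, 0.3, -0.5, 1.1), 0.7) < 1e-12)
```

The identity holds for any 4-momentum, on or off the mass shell, because it relies only on the anticommutation properties of the $\gamma$-matrices, which give $\not{k}\not{k} = k^2$.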

One might now worry that the adoption of anticommutation relations for
Dirac fields might spoil ‘causality’, in the sense of the discussion after (7.32).
One finds, indeed, that the fields $\hat\psi$ and $\bar{\hat\psi}$ anticommute at space-like
separation, but this is enough to preserve causality for physical observables, which
will involve an even number of fermionic fields.

We now turn to the problem of quantizing the Maxwell (electromagnetic) 
field. 


7.3 The Maxwell field $A^\mu(x)$
7.3.1 The classical field case


Following the now familiar procedure, our first task is to find the classical field
Lagrangian which, via the corresponding Euler-Lagrange equations, yields
the Maxwell equation for the electromagnetic potential $A^\nu$, namely (cf (2.22))

$$\Box A^\nu - \partial^\nu(\partial_\mu A^\mu) = j^\nu_{\rm em}. \tag{7.65}$$

The answer is (see problem 7.10)

$$\mathcal L_{\rm em} = -\tfrac{1}{4}F_{\mu\nu}F^{\mu\nu} - j^\nu_{\rm em}A_\nu \tag{7.66}$$

where $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$. So the pure A-field part is the Maxwell Lagrangian

$$\mathcal L_A = -\tfrac{1}{4}F_{\mu\nu}F^{\mu\nu}. \tag{7.67}$$

Before proceeding to try to quantize (7.67), we need to understand some
important aspects of the free classical field $A^\nu(x)$.
When $j^\nu_{\rm em}$ is set equal to zero, $A^\nu$ satisfies the equation

$$\partial_\mu F^{\mu\nu} = \Box A^\nu - \partial^\nu(\partial_\mu A^\mu) = 0. \tag{7.68}$$


As we have seen in section 2.3, these equations are left unchanged if we perform
the gauge transformation

$$A^\mu \to A'^\mu = A^\mu - \partial^\mu\chi. \tag{7.69}$$

We can use this freedom to choose the $A^\mu$ with which we work to satisfy the
condition

$$\partial_\mu A^\mu = 0. \tag{7.70}$$

This is called the Lorentz condition. The process of choosing a particular
condition on $A^\mu$ so as to define it (ultimately) uniquely is called ‘choosing
a gauge’; actually the condition (7.70) does not yet define $A^\mu$ uniquely, as
we shall see shortly. The Lorentz condition is a very convenient one, since it
decouples the different components of $A^\mu$ in Maxwell’s equations (7.68), and
does so in a covariant way, leaving the very simple equation

$$\Box A^\mu = 0. \tag{7.71}$$

This has plane-wave solutions of the form

$$A^\mu = N\epsilon^\mu e^{-ik\cdot x} \tag{7.72}$$

with $k^2 = 0$ (i.e. $k_0^2 = \boldsymbol k^2$), where $N$ is a normalization factor and $\epsilon^\mu$ is a
polarization vector for the wave. The gauge condition (7.70) now reduces to
a condition on $\epsilon^\mu$:

$$k\cdot\epsilon = 0. \tag{7.73}$$


However, we have not yet exhausted all the gauge freedom. We are still free
to make another shift in the potential

$$A^\mu \to A^\mu - \partial^\mu\tilde\chi \qquad (7.74)$$

provided $\tilde\chi$ satisfies the massless KG equation

$$\Box\tilde\chi = 0. \qquad (7.75)$$

This condition on $\tilde\chi$ ensures that, even after the further shift, the resulting
potential still satisfies $\partial_\mu A^\mu = 0$. For our plane-wave solutions, this residual
gauge freedom corresponds to changing $\epsilon^\mu$ by a multiple of $k^\mu$:

$$\epsilon^\mu \to \epsilon^\mu + \beta k^\mu \equiv \epsilon'^\mu \qquad (7.76)$$

which still satisfies $\epsilon'\cdot k = 0$ since $k^2 = 0$ for these free-field solutions. The
condition $k^2 = 0$ is, of course, the statement that a free photon is massless.
This freedom has important consequences. Consider a solution with

$$k^\mu = (k^0, \boldsymbol{k}), \qquad (k^0)^2 = \boldsymbol{k}^2 \qquad (7.77)$$

and polarization vector

$$\epsilon^\mu = (\epsilon^0, \boldsymbol{\epsilon}) \qquad (7.78)$$

satisfying the Lorentz condition

$$k\cdot\epsilon = 0. \qquad (7.79)$$


Gauge invariance now implies that we can add multiples of $k^\mu$ to $\epsilon^\mu$ and still
have a satisfactory polarization vector.

It is therefore clear that we can arrange for the time component of $\epsilon^\mu$ to
vanish so that the Lorentz condition reduces to the 3-vector condition

$$\boldsymbol{k}\cdot\boldsymbol{\epsilon} = 0. \qquad (7.80)$$

This means that there are only two independent polarization vectors, both
transverse to $\boldsymbol{k}$, i.e. to the propagation direction. For a wave travelling in the
z-direction ($k^\mu = (k^0, 0, 0, k^0)$) these may be chosen to be

$$\boldsymbol{\epsilon}(1) = (1, 0, 0) \qquad (7.81)$$
$$\boldsymbol{\epsilon}(2) = (0, 1, 0). \qquad (7.82)$$
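
As a small numerical illustration of how the residual freedom (7.76) removes the time component, the following sketch shifts a polarization vector by a multiple of $k^\mu$. The numerical values and the function names are assumptions made purely for illustration; the only inputs from the text are the metric signature, the masslessness condition $k^2 = 0$ and the Lorentz condition $k\cdot\epsilon = 0$.

```python
def minkowski_dot(a, b):
    """a.b with metric (+, -, -, -)."""
    return a[0] * b[0] - sum(x * y for x, y in zip(a[1:], b[1:]))


def remove_time_component(eps, k):
    """Shift eps -> eps + beta*k, with beta chosen so the time component vanishes."""
    beta = -eps[0] / k[0]
    return [e + beta * kk for e, kk in zip(eps, k)]


k = [2.0, 0.0, 0.0, 2.0]      # massless: k.k = 0, travelling along z
eps = [0.5, 1.0, -0.3, 0.5]   # illustrative values chosen so that k.eps = 0
eps_prime = remove_time_component(eps, k)

print(minkowski_dot(k, k), minkowski_dot(k, eps))   # both vanish
print(eps_prime)                                    # time and z components are now zero
print(minkowski_dot(k, eps_prime))                  # Lorentz condition still holds
```

For this $k^\mu$ the shifted vector is automatically transverse: once the time component is gone, $k\cdot\epsilon' = 0$ forces the component along $\boldsymbol{k}$ to vanish as well.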


Such a choice corresponds to linear polarization of the associated $\boldsymbol{E}$ and $\boldsymbol{B}$
fields, which can be easily calculated from (2.10) and (2.11), given

$$A^\mu = N(0, \boldsymbol{\epsilon}(i))e^{-ik\cdot x}, \qquad i = 1, 2. \qquad (7.83)$$

A commonly used alternative choice is

$$\boldsymbol{\epsilon}(\lambda = +1) = -\frac{1}{\sqrt{2}}(1, i, 0) \qquad (7.84)$$
$$\boldsymbol{\epsilon}(\lambda = -1) = \frac{1}{\sqrt{2}}(1, -i, 0) \qquad (7.85)$$

(linear combinations of (7.81) and (7.82)), which correspond to circularly
polarized radiation. The phase convention in (7.84) and (7.85) is the standard
one in quantum mechanics for states of definite spin projection (‘helicity’)
$\lambda = \pm 1$ along the direction of motion (the z-axis here). We may easily check
that

$$\boldsymbol{\epsilon}^*(\lambda)\cdot\boldsymbol{\epsilon}(\lambda') = \delta_{\lambda\lambda'} \qquad (7.86)$$

or, in terms of the corresponding 4-vectors $\epsilon^\mu = (0, \boldsymbol{\epsilon})$,

$$\epsilon^*(\lambda)\cdot\epsilon(\lambda') = -\delta_{\lambda\lambda'}. \qquad (7.87)$$
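
The orthonormality relations (7.86) and (7.87) are quick to verify by direct arithmetic. The sketch below does so for the circular polarization vectors (7.84) and (7.85), and also confirms that they are the stated combinations of the linear vectors (7.81) and (7.82); the dictionary and function names are illustrative choices, not notation from the text.

```python
import math

S = 1 / math.sqrt(2)
EPS = {
    +1: [-S, -S * 1j, 0],   # (7.84)
    -1: [S, -S * 1j, 0],    # (7.85)
}
LINEAR = {1: [1, 0, 0], 2: [0, 1, 0]}   # (7.81), (7.82)


def hermitian_dot(a, b):
    """3-vector product a^* . b."""
    return sum(complex(x).conjugate() * y for x, y in zip(a, b))


def minkowski_hermitian_dot(a, b):
    """4-vector product a^* . b with metric (+, -, -, -)."""
    return complex(a[0]).conjugate() * b[0] - hermitian_dot(a[1:], b[1:])


# (7.86): 3-vector overlaps
overlaps = {(l1, l2): hermitian_dot(EPS[l1], EPS[l2]) for l1 in EPS for l2 in EPS}

# (7.87): the same overlaps for the 4-vectors (0, eps)
overlaps4 = {(l1, l2): minkowski_hermitian_dot([0] + EPS[l1], [0] + EPS[l2])
             for l1 in EPS for l2 in EPS}

# The circular vectors as combinations of the linear ones
from_linear = {
    +1: [-S * (a + 1j * b) for a, b in zip(LINEAR[1], LINEAR[2])],
    -1: [S * (a - 1j * b) for a, b in zip(LINEAR[1], LINEAR[2])],
}

for key in sorted(overlaps):
    print(key, overlaps[key], overlaps4[key])
```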


We have therefore arrived at the result, familiar in classical electromagnetic 
theory, that the free electromagnetic fields are purely transverse. Though they 
are described in this formalism by a vector potential with apparently four 
independent components (V, A), the condition (7.70) reduces this number by 
one, and the further gauge freedom exploited in (7.74)-(7.76) reduces it by 
one more. 

A crucial point to note is that the reduction to only two independent field
components (polarization states) can be traced back to the fact that the free
photon is massless: see the remark after (7.76). By contrast, for massive spin-1
bosons, such as the $W^\pm$ and $Z^0$, all three expected polarization states are
indeed present. However, weak interactions are described by a gauge theory,
and the $W^\pm$ and $Z^0$ particles are gauge-field quanta, analogous to the photon.
How gauge invariance can be reconciled with the existence of massive gauge
quanta with three polarization states will be explained in volume 2.

We may therefore write the plane-wave mode expansion for the classical
$A^\mu(x)$ field in the form

$$A^\mu(x) = \sum_\lambda \int \frac{d^3k}{(2\pi)^3\sqrt{2\omega}} \left[\epsilon^\mu(k,\lambda)\alpha(k,\lambda)e^{-ik\cdot x} + \epsilon^{\mu*}(k,\lambda)\alpha^*(k,\lambda)e^{ik\cdot x}\right] \qquad (7.88)$$

where the sum is over the two possible polarization states $\lambda$, for given $\boldsymbol{k}$, as
described by the suitable polarization vector $\epsilon^\mu(k,\lambda)$, and $\omega = |\boldsymbol{k}|$.

It would seem that all we have to do now, in order to ‘quantize’ (7.88), is
to promote $\alpha$ and $\alpha^*$ to operators $\hat\alpha$ and $\hat\alpha^\dagger$, as usual. However, things are
actually not nearly so simple.


7.3.2 Quantizing $A^\mu(x)$


Readers familiar with Lagrangian mechanics may already suspect that
quantizing $A^\mu$ is not going to be straightforward. The problem is that, clearly,
$A^\mu(x)$ has four (Lorentz) components; but, equally clearly in view of the
previous section, they are not all independent field components or field
degrees of freedom. In fact, there are only two independent degrees of freedom,
both transverse. Thus there are constraints on the four fields, for instance the
gauge condition (7.70). Constrained systems are often awkward to handle in
classical mechanics (see for example Goldstein 1980) or classical field theory;
and they present major problems when it comes to canonical quantization. 
It is actually at just this point that the ‘path-integral’ approach to quantiza- 
tion, alluded to briefly at the end of section 5.2.2, comes into its own. This 
is basically because it does not involve non-commuting (or anticommuting) 
operators and it is therefore to that extent closer to the classical case. This 
means that the relatively straightforward procedures available for constrained 
classical mechanics systems can — when suitably generalized! — be efficiently 
brought to bear on the quantum problem. For an introduction to these ideas, 
we refer to Swanson (1992). 

However, we do not wish at this stage to take what would be a very long
detour, in setting up the path-integral quantization of QED. We shall continue
along the ‘canonical’ route. To see the kind of problems we encounter, let us
try and repeat for the $A^\mu$ field the ‘canonical’ procedure we introduced in
section 5.2.5. This was based, crucially, on obtaining from the Lagrangian the
momentum $\pi$ conjugate to $\phi$, and then imposing the commutation relation
(5.117) on the corresponding operators $\hat\pi$ and $\hat\phi$. But inspection of our Maxwell
Lagrangian (7.67) quickly reveals that

$$\frac{\partial\mathcal{L}_A}{\partial\dot{A}^0} = 0 \qquad (7.89)$$

and hence there is no canonical momentum $\pi^0$ conjugate to $A^0$. We appear
to be stymied before we can even start.

There is another problem as well. Following the procedure explained in
chapter 6, we expect that the Feynman propagator for the $A^\mu$ field, namely
$\langle 0|T(\hat{A}^\mu(x_1)\hat{A}^\nu(x_2))|0\rangle$, will surely appear, describing the propagation of a
photon between $x_1$ and $x_2$. In the case of real scalar fields, problem 6.3
showed that the analogous quantity was actually a Green function for the
KG differential operator, $(\Box + m^2)$. It turned out, in that case, that what
we really wanted was the Fourier transform of the Green function, which was
essentially (apart from the tricky ‘$i\epsilon$ prescription’ and a trivial $-i$ factor) the
inverse of the momentum-space operator corresponding to $(\Box + m^2)$, namely
$(-k^2+m^2)^{-1}$ (see equation (6.98) and appendix G, and also (7.58)-(7.60) for
the Dirac case). Suppose, then, that we try to follow this route to obtaining
the propagator for the $A^\mu$ field. For this it is sufficient to consider the classical
equations (7.68) with $j^\nu_{\rm em} = 0$, written in $k$ space (problem 7.11(a)):

$$(-k^2 g^{\nu\mu} + k^\nu k^\mu)\tilde{A}_\mu(k) \equiv M^{\nu\mu}\tilde{A}_\mu(k) = 0 \qquad (7.90)$$

where $\tilde{A}_\mu(k)$ is the Fourier transform of $A_\mu(x)$. We therefore require the
inverse

$$(-k^2 g^{\nu\mu} + k^\nu k^\mu)^{-1} \equiv (M^{-1})^{\nu\mu}. \qquad (7.91)$$


Unfortunately it is easy to show that this inverse does not exist. From
Lorentz covariance, it has to transform as a second-rank tensor, and the only
ones available are $g^{\mu\nu}$ and $k^\mu k^\nu$. So the general form of $(M^{-1})^{\nu\mu}$ must be

$$(M^{-1})^{\nu\mu} = A(k^2)g^{\nu\mu} + B(k^2)k^\nu k^\mu. \qquad (7.92)$$

Now the inverse is defined by

$$(M^{-1})^{\nu\mu}M_{\mu\sigma} = g^\nu_\sigma. \qquad (7.93)$$

Putting (7.92) and (7.90) into (7.93) yields (problem 7.11(b))

$$-k^2 A(k^2)g^\nu_\sigma + A(k^2)k^\nu k_\sigma = g^\nu_\sigma \qquad (7.94)$$

which cannot be satisfied. So we are thwarted again.
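
Another way to see that the inverse cannot exist is to note that $M^{\nu\mu}$ annihilates $k_\mu$ itself, so it has a zero eigenvalue for any momentum. The following sketch checks this with exact integer arithmetic for one arbitrarily chosen off-shell momentum; the test values and helper names are assumptions introduced only for illustration.

```python
G = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]   # metric g^{mu nu}


def maxwell_matrix(k):
    """M^{nu mu} = -k^2 g^{nu mu} + k^nu k^mu for contravariant components k^mu."""
    k2 = sum(G[m][m] * k[m] * k[m] for m in range(4))
    return [[-k2 * G[n][m] + k[n] * k[m] for m in range(4)] for n in range(4)]


def det(mat):
    """Determinant by cofactor expansion along the first row (fine for 4x4)."""
    if len(mat) == 1:
        return mat[0][0]
    total = 0
    for j, entry in enumerate(mat[0]):
        minor = [row[:j] + row[j + 1:] for row in mat[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


k_up = [3, 1, -2, 1]                             # arbitrary off-shell momentum, k^2 = 3
k_down = [G[m][m] * k_up[m] for m in range(4)]   # covariant components k_mu
M = maxwell_matrix(k_up)
M_on_k = [sum(M[n][m] * k_down[m] for m in range(4)) for n in range(4)]

print(M_on_k)    # the zero vector: k_mu is a null direction of M
print(det(M))    # hence the determinant vanishes
```

The null direction is exactly the gauge direction: a pure-gauge potential $\tilde{A}_\mu \propto k_\mu$ solves (7.90) for every $k$.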

Nothing daunted, the attentive reader may have an answer ready for the
propagator problem. Suppose that, instead of (7.68), we start from the much
simpler equation

$$\Box A^\mu = 0 \qquad (7.95)$$

which results from imposing the Lorentz condition (7.70). Then, in
momentum-space, (7.95) becomes

$$-k^2\tilde{A}^\mu = 0. \qquad (7.96)$$

The ‘$-k^2$’ on the left-hand side certainly has an inverse, implying that the
Feynman propagator for the photon is (proportional to) $g_{\mu\nu}/k^2$. This form
is indeed plausible, as it is very much what we would expect by taking the
massless limit of the spin-0 propagator and tacking on $g_{\mu\nu}$ to account for the
Lorentz indices in $\langle 0|T(\hat{A}_\mu(x_1)\hat{A}_\nu(x_2))|0\rangle$ (but then why no term in $k_\mu k_\nu$?
See the final two paragraphs of this section!).

Perhaps this approach helps with the ‘no canonical momentum $\pi^0$’ problem
too. Let us ask: What Lagrangian leads to the field equation (7.95)? The
answer is (problem 7.12)

$$\mathcal{L}_L = -\tfrac{1}{4}F_{\mu\nu}F^{\mu\nu} - \tfrac{1}{2}(\partial_\mu A^\mu)^2. \qquad (7.97)$$

This form does seem to offer better prospects for quantization, since at least
all our $\pi^\mu$’s are non-zero; in particular

$$\pi^0 = \frac{\partial\mathcal{L}_L}{\partial\dot{A}_0} = -\partial_\mu A^\mu. \qquad (7.98)$$

The other $\pi$’s are unchanged by the addition of the extra term in (7.97) and
are given by

$$\pi^i = -\dot{A}^i + \partial^i A^0. \qquad (7.99)$$

Interestingly, these are precisely the electric fields $E^i$ (see (2.10)). Let us see,
then, if all our problems are solved with $\mathcal{L}_L$.

Now that we have at least got four non-zero $\pi^\mu$’s, we can write down a
plausible set of commutation relations between the corresponding operator
quantities $\hat\pi^\mu$ and $\hat{A}^\nu$:

$$[\hat{A}_\mu(\boldsymbol{x}, t), \hat\pi_\nu(\boldsymbol{y}, t)] = i g_{\mu\nu}\delta^3(\boldsymbol{x} - \boldsymbol{y}). \qquad (7.100)$$


Again, the $g_{\mu\nu}$ is there to give the same Lorentz transformation character
on both sides of the equation. But we must now remember that, in the
classical case, our development rested on imposing the condition $\partial_\mu A^\mu = 0$
(7.70). Can we, in the quantum version we are trying to construct, simply
impose $\partial_\mu\hat{A}^\mu = 0$? We certainly cannot do so in $\hat{\mathcal{L}}_L$, or we are back to $\hat{\mathcal{L}}_A$
again (besides, constraints cannot be ‘substituted back’ into Lagrangians, in
general). Furthermore, if we set $\mu = \nu = 0$ in (7.100), then the right-hand
side is non-zero while the left-hand side is zero if $\partial_\mu\hat{A}^\mu = 0 = \hat\pi^0$. So it is
inconsistent simply to set $\partial_\mu\hat{A}^\mu = 0$.

We will return to the treatment of ‘$\partial_\mu\hat{A}^\mu = 0$’ eventually. First, let us press
on with (7.97) and see if we can get as far as a (quantized) mode expansion,
of the form (7.88), for $\hat{A}^\mu(x)$.

To set this up, we need to massage the commutator (7.100) into a form
as close as possible to the canonical ‘$[\hat\phi, \dot{\hat\phi}] = i\delta$’ form. Assuming the other
commutation relations (cf (5.118))

$$[\hat{A}_\mu(\boldsymbol{x}, t), \hat{A}_\nu(\boldsymbol{y}, t)] = [\hat\pi_\mu(\boldsymbol{x}, t), \hat\pi_\nu(\boldsymbol{y}, t)] = 0 \qquad (7.101)$$

we see that the spatial derivatives of the $\hat{A}$’s commute with the $\hat{A}$’s, and with
each other, at equal times. This implies that we can rewrite the (quantum)
$\hat\pi$’s as

$$\hat\pi^\mu = -\dot{\hat{A}}^\mu + \text{pieces that commute}. \qquad (7.102)$$

Hence (7.100) can be rewritten as

$$[\hat{A}_\mu(\boldsymbol{x}, t), \dot{\hat{A}}_\nu(\boldsymbol{y}, t)] = -i g_{\mu\nu}\delta^3(\boldsymbol{x} - \boldsymbol{y}) \qquad (7.103)$$

and (7.101) remains the same. Now (7.103) is indeed very much the same
as ‘$[\hat\phi, \dot{\hat\phi}] = i\delta$’ for the spatial components $\hat{A}^i$, but the sign is wrong in the
$\mu = \nu = 0$ case. We are not out of the maze yet.

Nevertheless, proceeding onwards on the basis of (7.103), we write the
quantum mode expansion as (cf (7.88))

$$\hat{A}^\mu(x) = \sum_{\lambda=0}^{3}\int \frac{d^3k}{(2\pi)^3\sqrt{2\omega}} \left[\epsilon^\mu(k,\lambda)\hat\alpha_\lambda(k)e^{-ik\cdot x} + \epsilon^{\mu*}(k,\lambda)\hat\alpha^\dagger_\lambda(k)e^{ik\cdot x}\right] \qquad (7.104)$$

where the sum is over four independent polarization states $\lambda = 0, 1, 2, 3$, since
all four fields are still in play. Before continuing, we need to say more about
these $\epsilon$’s (previously, we only had two of them, now we have four and they
are 4-vectors). We take $\boldsymbol{k}$ to be along the z-direction, as in our discussion of
the $\epsilon$’s in section 7.3.1, and choose two transverse polarization vectors as (cf
(7.81), (7.82))

$$\epsilon^\mu(k, \lambda = 1) = (0, 1, 0, 0), \qquad \epsilon^\mu(k, \lambda = 2) = (0, 0, 1, 0) \qquad \text{‘transverse polarizations’}. \qquad (7.105)$$

The other two $\epsilon$’s are

$$\epsilon^\mu(k, \lambda = 0) = (1, 0, 0, 0) \qquad \text{‘time-like polarization’} \qquad (7.106)$$

and

$$\epsilon^\mu(k, \lambda = 3) = (0, 0, 0, 1) \qquad \text{‘longitudinal polarization’}. \qquad (7.107)$$

Making (7.104) consistent with (7.103) then requires

$$[\hat\alpha_\lambda(k), \hat\alpha^\dagger_{\lambda'}(k')] = -g_{\lambda\lambda'}(2\pi)^3\delta^3(\boldsymbol{k} - \boldsymbol{k}'). \qquad (7.108)$$

This is where the wrong sign in (7.103) has come back to haunt us: we have
the wrong sign in (7.108) for the case $\lambda = \lambda' = 0$ (time-like modes).

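What is the consequence of this? It seems natural to assume that the
vacuum is defined by

$$\hat\alpha_\lambda(k)|0\rangle = 0 \qquad \text{for all } \lambda = 0, 1, 2, 3. \qquad (7.109)$$

But suppose we use (7.108) and (7.109) to calculate the normalization overlap
of a ‘one time-like photon’ state; this is

$$\langle k, \lambda = 0|k', \lambda = 0\rangle = \langle 0|\hat\alpha_0(k)\hat\alpha^\dagger_0(k')|0\rangle = -(2\pi)^3\delta^3(\boldsymbol{k} - \boldsymbol{k}') \qquad (7.110)$$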

and the state effectively has a negative norm (the k = k’ infinity is the stan- 
dard plane-wave artefact). Such states would threaten fundamental properties 
such as the conservation of total probability if they contributed, uncancelled, 
in physical processes. 

At this point we would do well to recall the condition ‘$\partial_\mu\hat{A}^\mu = 0$’, which
still needs to be taken into account, somehow, and it does indeed save us.
Gupta (1950) and Bleuler (1950) proposed that, rather than trying
(unsuccessfully) to impose it as an operator condition, one should replace it by the
weaker condition

$$\partial_\mu\hat{A}^{\mu(+)}(x)|\Psi\rangle = 0 \qquad (7.111)$$

where the (+) signifies the positive frequency part of $\hat{A}$, i.e. the part involving
annihilation operators, and $|\Psi\rangle$ is any physical state (including $|0\rangle$). From
(7.111) and its Hermitian conjugate

$$\langle\Psi|\partial_\mu\hat{A}^{\mu(-)}(x) = 0 \qquad (7.112)$$

we can deduce that the Lorentz condition (7.70) does hold for all expectation
values:

$$\langle\Psi|\partial_\mu\hat{A}^\mu|\Psi\rangle = \langle\Psi|\partial_\mu\hat{A}^{\mu(+)} + \partial_\mu\hat{A}^{\mu(-)}|\Psi\rangle = 0, \qquad (7.113)$$

and so the classical limit of this quantization procedure will recover the
classical Maxwell theory in Lorentz gauge.

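Using (7.104), (7.106) and (7.107) with $k^\mu = (|\boldsymbol{k}|, 0, 0, |\boldsymbol{k}|)$, condition
(7.111) becomes

$$[\hat\alpha_0(k) - \hat\alpha_3(k)]|\Psi\rangle = 0. \qquad (7.114)$$

To see the effect of this condition, consider the expression for the Hamiltonian
of this theory. In normally ordered form, it turns out to be

$$\hat{H} = \int \frac{d^3k}{(2\pi)^3}\,(\hat\alpha^\dagger_1\hat\alpha_1 + \hat\alpha^\dagger_2\hat\alpha_2 + \hat\alpha^\dagger_3\hat\alpha_3 - \hat\alpha^\dagger_0\hat\alpha_0)\,\omega \qquad (7.115)$$

so the contribution from the time-like modes looks dangerously negative.
However, for any physical state $|\Psi\rangle$, we have

$$\langle\Psi|(\hat\alpha^\dagger_3\hat\alpha_3 - \hat\alpha^\dagger_0\hat\alpha_0)|\Psi\rangle = \langle\Psi|(\hat\alpha^\dagger_3\hat\alpha_3 - \hat\alpha^\dagger_3\hat\alpha_0)|\Psi\rangle = \langle\Psi|\hat\alpha^\dagger_3(\hat\alpha_3 - \hat\alpha_0)|\Psi\rangle = 0, \qquad (7.116)$$

so that only the transverse modes survive.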
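
The step from (7.111) to (7.114) rests on a one-line piece of arithmetic: acting on $e^{-ik\cdot x}$, the derivative $\partial_\mu$ brings down $-ik_\mu$, so each mode contributes in proportion to $k\cdot\epsilon(\lambda)$. The sketch below simply tabulates these four products for the polarization vectors (7.105)-(7.107); the value chosen for $|\boldsymbol{k}|$ and the variable names are illustrative assumptions.

```python
def minkowski_dot(a, b):
    """a.b with metric (+, -, -, -)."""
    return a[0] * b[0] - sum(x * y for x, y in zip(a[1:], b[1:]))


# Polarization vectors (7.105)-(7.107) for k along the z-direction
EPS = {
    0: [1, 0, 0, 0],   # time-like
    1: [0, 1, 0, 0],   # transverse
    2: [0, 0, 1, 0],   # transverse
    3: [0, 0, 0, 1],   # longitudinal
}

kappa = 2                      # |k|, an arbitrary illustrative value
k = [kappa, 0, 0, kappa]       # massless photon momentum along z

# Coefficient of each annihilation operator in the divergence of A^(+)
coeffs = {lam: minkowski_dot(k, EPS[lam]) for lam in EPS}
print(coeffs)
```

Only the time-like and longitudinal modes enter, with equal and opposite weights $\pm|\boldsymbol{k}|$, which is the combination $\hat\alpha_0 - \hat\alpha_3$ appearing in (7.114).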

We hope that by now the reader will have at least begun to develop a 
healthy respect for quantum gauge fields — and the non-Abelian versions in 
volume 2 are even worse! The fact is that the canonical approach has a difficult 
time coping with these constrained systems. Indeed, the complete Feynman 
rules in the non-Abelian case were found by an alternative quantization pro- 
cedure (‘path integral’ quantization). This, however, is outside the scope of 
the present volume. The important points for our purposes are as follows. It 
is possible to carry out a consistent quantization in the Gupta-Bleuler for- 
malism, which is the quantum version of the Maxwell theory constrained by 
the Lorentz condition. The propagator for the photon in this theory is 


$$\frac{-i g^{\mu\nu}}{k^2 + i\epsilon} \qquad (7.117)$$


which is the expected massless limit of the KG propagator as far as the spatial 
components are concerned (the time-like component has that negative sign). 

As in all the other cases we have dealt with so far, the Feynman propagator
$\langle 0|T(\hat{A}^\mu(x_1)\hat{A}^\nu(x_2))|0\rangle$ can be evaluated using the expansion (7.104) and the
commutation relations (7.108). One finds that it is indeed equal to the Fourier
transform of $-ig^{\mu\nu}/(k^2 + i\epsilon)$ just as asserted in (7.117). For this result, we need
the ‘pseudo completeness relation’ (problem 7.13)

$$-\epsilon^\mu(k, \lambda = 0)\epsilon^\nu(k, \lambda = 0) + \epsilon^\mu(k, \lambda = 1)\epsilon^\nu(k, \lambda = 1) + \epsilon^\mu(k, \lambda = 2)\epsilon^\nu(k, \lambda = 2) + \epsilon^\mu(k, \lambda = 3)\epsilon^\nu(k, \lambda = 3) = -g^{\mu\nu}. \qquad (7.118)$$

We call this a pseudo completeness relation because of the minus sign
appearing in the first term: its origin in the evaluation of this vev is precisely the
‘wrong sign commutator’ for the $\lambda = 0$ mode, (7.108).
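
With the explicit vectors (7.105)-(7.107), the relation (7.118) can be confirmed entry by entry. The following sketch does this and, for contrast, also forms the sum without the extra minus sign; the names `ZETA`, `pseudo_sum` and `plain_sum` are illustrative and not taken from the text.

```python
G = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]   # metric g^{mu nu}

EPS = {
    0: [1, 0, 0, 0],
    1: [0, 1, 0, 0],
    2: [0, 0, 1, 0],
    3: [0, 0, 0, 1],
}
ZETA = {0: -1, 1: 1, 2: 1, 3: 1}   # the sign attached to each mode in (7.118)

pseudo_sum = [[sum(ZETA[lam] * EPS[lam][m] * EPS[lam][n] for lam in EPS)
               for n in range(4)] for m in range(4)]
plain_sum = [[sum(EPS[lam][m] * EPS[lam][n] for lam in EPS)
              for n in range(4)] for m in range(4)]
minus_g = [[-G[m][n] for n in range(4)] for m in range(4)]

print(pseudo_sum == minus_g)   # True: the signed sum reproduces -g^{mu nu}
print(plain_sum == minus_g)    # False: the unsigned sum is just the unit matrix
```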

Thus the gauge choice (7.70) can be made to work in quantum field theory
via the condition (7.111). But other choices are possible too. In particular, a
useful generalization of the Lagrangian (7.97) is

$$\mathcal{L}_\xi = -\tfrac{1}{4}F_{\mu\nu}F^{\mu\nu} - \frac{1}{2\xi}(\partial_\mu A^\mu)^2 \qquad (7.119)$$

where $\xi$ is a constant, the ‘gauge parameter’. $\mathcal{L}_\xi$ leads to the equation of
motion (problem 7.14)

$$\left(\Box g_{\mu\nu} - \partial_\mu\partial_\nu + \frac{1}{\xi}\partial_\mu\partial_\nu\right)A^\nu = 0. \qquad (7.120)$$

In momentum-space this becomes (problem 7.14)

$$\left(-k^2 g_{\mu\nu} + k_\mu k_\nu - \frac{1}{\xi}k_\mu k_\nu\right)\tilde{A}^\nu = 0. \qquad (7.121)$$

The inverse of the matrix acting on $\tilde{A}^\nu$ exists, and gives us the more general
photon propagator (or Green function)

$$\frac{i\left[-g^{\mu\nu} + (1 - \xi)k^\mu k^\nu/k^2\right]}{k^2 + i\epsilon} \qquad (7.122)$$

as shown in problem 7.14. The previous case is recovered as $\xi \to 1$.
Confusingly, the choice $\xi = 1$ is often called the ‘Feynman gauge’, though in classical
terms it corresponds to the Lorentz gauge choice. For some purposes the
‘Landau gauge’ $\xi = 0$ (which is well defined in (7.122)) is convenient. In any event,
it is important to be clear that the photon propagator depends on the choice
of gauge. Formula (7.122) is the photon analogue of ‘rule (ii)’ in (6.103).
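
The claim that the matrix in (7.121) is invertible, with inverse given by the bracket in (7.122), can be checked with exact rational arithmetic. The sketch below multiplies the two for an arbitrary off-shell momentum and a few values of $\xi$, dropping the overall factor $i$ and the $i\epsilon$ (which play no role in the algebra); the test momentum and function names are assumptions made for illustration.

```python
from fractions import Fraction

G = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]   # metric (either index position)
IDENTITY = [[1 if m == n else 0 for n in range(4)] for m in range(4)]


def k_squared(k_up):
    return sum(G[m][m] * k_up[m] * k_up[m] for m in range(4))


def kinetic_matrix(k_up, xi):
    """Matrix of (7.121): -k^2 g_{mu nu} + (1 - 1/xi) k_mu k_nu."""
    k_dn = [G[m][m] * k_up[m] for m in range(4)]
    k2 = k_squared(k_up)
    return [[-k2 * G[m][n] + (1 - 1 / xi) * k_dn[m] * k_dn[n] for n in range(4)]
            for m in range(4)]


def propagator_bracket(k_up, xi):
    """[-g^{nu sigma} + (1 - xi) k^nu k^sigma / k^2] / k^2, i.e. (7.122) without i and i*epsilon."""
    k2 = k_squared(k_up)
    return [[(-G[n][s] + (1 - xi) * k_up[n] * k_up[s] / k2) / k2 for s in range(4)]
            for n in range(4)]


def product(a, b):
    return [[sum(a[m][n] * b[n][s] for n in range(4)) for s in range(4)]
            for m in range(4)]


k_up = [Fraction(3), Fraction(1), Fraction(-2), Fraction(1)]   # arbitrary, k^2 = 3
results = {
    xi: product(kinetic_matrix(k_up, xi), propagator_bracket(k_up, xi)) == IDENTITY
    for xi in (Fraction(1, 2), Fraction(1), Fraction(3))
}
print(results)   # every entry should be True
```

Note that the matrix in (7.121) is not defined at $\xi = 0$, although the bracket in (7.122) is; this is the sense in which the Landau gauge is reached only at the level of the propagator.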

This may seem to imply that when we use the photon propagator (7.122) 
in Feynman amplitudes we will not get a definite answer, but rather one 
that depends on the arbitrary parameter $\xi$. This is a serious worry. But the
propagator is not by itself a physical quantity — it is only one part of a physical 
amplitude. In the following chapter we shall derive the amplitudes for some 
simple processes in scalar and spinor electrodynamics, and one can verify that 
they are gauge invariant — either in the sense (for external photons) of being 
invariant under the replacement (7.76), or (in the case of internal photons) of 
being independent of $\xi$. It can be shown (Weinberg 1995, section 10.5) that
at a given order in perturbation theory the sum of all diagrams contributing 
to the S-matrix is gauge invariant.


7.4 Introduction of electromagnetic interactions


After all these preliminaries, the job of introducing the first of our gauge 
field interactions, namely electromagnetism, into our non-interacting theory 
of complex scalar fields, and of Dirac fields, is very easy. From our discussion 
in chapter 2, we have a strong indication of how to introduce electromagnetic 
interactions into our theories. The ‘gauge principle’ in quantum mechanics 
consisted in elevating a global (space-time-independent) U(1) phase invariance 
into a local (space-time-dependent) U(1) invariance — the compensating fields 
being then identified with the electromagnetic ones. In quantum field theory, 
exactly the same principle exists and leads to the form of the electromagnetic 
interactions. Indeed, in the field theory formalism we have a true local U(1) 
phase (gauge) invariance of the Lagrangian (rather than a gauge covariance 
of a wave equation) and we shall be able to exhibit explicitly the symmetry 
current, and symmetry operator, associated with the U(1) invariance — and 
identify them precisely with the electromagnetic current and charge. 

We have seen that for both the complex scalar and the Dirac fields the 
free Lagrangian is invariant under U(1) transformations (see (7.22) and (7.48)) 
which, we once again emphasize, are global. Let us therefore promote these 
global invariances into local ones in the way learned in chapter 2 — namely by 
invoking the ‘gauge principle’ replacement 


$$\partial^\mu \to \hat{D}^\mu = \partial^\mu + iq\hat{A}^\mu \qquad (7.123)$$


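for a particle of charge $q$, this time written in terms of the quantum field $\hat{A}^\mu$.
In the case of the Dirac Lagrangian

$$\hat{\mathcal{L}}_D = \bar{\hat\psi}(i\gamma^\mu\partial_\mu - m)\hat\psi \qquad (7.124)$$

we expect to be able to ‘promote’ it to one which is invariant under the local
U(1) phase transformation¹

$$\hat\psi(\boldsymbol{x}, t) \to \hat\psi'(\boldsymbol{x}, t) = e^{-iq\hat\chi(\boldsymbol{x}, t)}\hat\psi(\boldsymbol{x}, t) \qquad (7.125)$$

provided we make the replacement (7.123) and demand that the (quantized)
4-vector potential transforms as (cf (2.15) with the sign change for $\hat\chi$)

$$\hat{A}^\mu \to \hat{A}'^\mu = \hat{A}^\mu + \partial^\mu\hat\chi. \qquad (7.126)$$

Thus the locally U(1)-invariant Dirac Lagrangian is expected to be

$$\hat{\mathcal{L}}_{D\,{\rm local}} = \bar{\hat\psi}(i\gamma^\mu\hat{D}_\mu - m)\hat\psi. \qquad (7.127)$$

¹Note that the classical field $\chi(\boldsymbol{x}, t)$ of (2.34) has become a quantum field $\hat\chi(\boldsymbol{x}, t)$ in
(7.125); the sign change of $\hat\chi$ compared with $\chi$ is conventional in qft.

The invariance of (7.127) under (7.125) is easy to check, using the crucial
property (2.43), which clearly carries over to the quantum field case:

$$\hat{D}'_\mu\hat\psi' = e^{-iq\hat\chi}(\hat{D}_\mu\hat\psi). \qquad (7.128)$$

Equation (7.128) implies at once that

$$(i\gamma^\mu\hat{D}'_\mu - m)\hat\psi' = e^{-iq\hat\chi}(i\gamma^\mu\hat{D}_\mu - m)\hat\psi, \qquad (7.129)$$

while taking the conjugate of (7.125) yields

$$\bar{\hat\psi}' = \bar{\hat\psi}e^{iq\hat\chi}. \qquad (7.130)$$

Thus we have

$$\bar{\hat\psi}'(i\gamma^\mu\hat{D}'_\mu - m)\hat\psi' = \bar{\hat\psi}e^{iq\hat\chi}e^{-iq\hat\chi}(i\gamma^\mu\hat{D}_\mu - m)\hat\psi \qquad (7.131)$$
$$= \bar{\hat\psi}(i\gamma^\mu\hat{D}_\mu - m)\hat\psi \qquad (7.132)$$

and the invariance is proved.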
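
The covariance property (7.128) is the heart of this argument, and it can be illustrated numerically in a deliberately simplified setting. The sketch below treats $\psi$, $\chi$ and $A$ as ordinary (classical, commuting) functions of a single coordinate and uses finite differences for the derivative; the particular test functions, the value of $q$ and all function names are assumptions chosen only for illustration.

```python
import cmath
import math

q = 0.7   # illustrative charge


def deriv(f, x, h=1e-5):
    """Central finite-difference derivative."""
    return (f(x + h) - f(x - h)) / (2 * h)


def psi(x):
    return cmath.exp(-x * x) * (1 + 0.5j * x)   # arbitrary smooth test function


def chi(x):
    return math.sin(x)                          # arbitrary gauge function


def A(x):
    return math.cos(2 * x)                      # arbitrary potential component


def covariant_derivative(field, potential, x):
    """D = d/dx + i q A acting on `field` at the point x."""
    return deriv(field, x) + 1j * q * potential(x) * field(x)


def psi_prime(x):
    return cmath.exp(-1j * q * chi(x)) * psi(x)   # one-dimensional analogue of (7.125)


def A_prime(x):
    return A(x) + deriv(chi, x)                   # one-dimensional analogue of (7.126)


x0 = 0.3
lhs = covariant_derivative(psi_prime, A_prime, x0)
rhs = cmath.exp(-1j * q * chi(x0)) * covariant_derivative(psi, A, x0)
unshifted = covariant_derivative(psi_prime, A, x0)   # forgetting to transform A

print(abs(lhs - rhs))         # tiny: D' psi' = exp(-i q chi) D psi
print(abs(unshifted - rhs))   # not small: the shift of A is essential
```

The second number shows what the compensating field is for: without the shift (7.126), the derivative of the local phase is left over.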
The Lagrangian has therefore gained an interaction term

$$\hat{\mathcal{L}}_D \to \hat{\mathcal{L}}_{D\,{\rm local}} = \hat{\mathcal{L}}_D + \hat{\mathcal{L}}_{\rm int} \qquad (7.133)$$

where

$$\hat{\mathcal{L}}_{\rm int} = -q\bar{\hat\psi}\gamma^\mu\hat\psi\hat{A}_\mu. \qquad (7.134)$$

Since the addition of $\hat{\mathcal{L}}_{\rm int}$ has not changed the canonical momenta, the
Hamiltonian then becomes $\hat{\mathcal{H}} = \hat{\mathcal{H}}_D + \hat{\mathcal{H}}'_D$, where

$$\hat{\mathcal{H}}'_D = -\hat{\mathcal{L}}_{\rm int} = q\bar{\hat\psi}\gamma^\mu\hat\psi\hat{A}_\mu = q\hat\psi^\dagger\hat\psi\hat{A}^0 - q\hat\psi^\dagger\boldsymbol{\alpha}\hat\psi\cdot\hat{\boldsymbol{A}} \qquad (7.135)$$

which is the field theory analogue of the potential in (3.102). It has the
expected form ‘$\rho A^0 - \boldsymbol{j}\cdot\boldsymbol{A}$’ if we identify the electromagnetic charge density
operator with $q\hat\psi^\dagger\hat\psi$ (the charge times the number density operator) and the
electromagnetic current density operator with $q\hat\psi^\dagger\boldsymbol{\alpha}\hat\psi$. The electromagnetic
4-vector current operator $\hat{j}^\mu_{\rm em}$ is thus identified as

$$\hat{j}^\mu_{\rm em} = q\bar{\hat\psi}\gamma^\mu\hat\psi, \qquad (7.136)$$

which is gauge invariant and a Lorentz 4-vector. The Lagrangian (7.134) is
manifestly Lorentz invariant.

We now note that $\hat{j}^\mu_{\rm em}$ is just $q$ times the symmetry current $\hat{N}^\mu_\psi$ of
section 7.2 (see equation (7.50)). Conservation of $\hat{j}^\mu_{\rm em}$ would follow from global
U(1) invariance alone (i.e. $\hat\chi$ a constant in equation (7.125)); but many
Lagrangians, including interactions, could be constructed obeying this global
U(1) invariance. The force of the local U(1) invariance requirement is that it
has specified a unique form of the interaction (i.e. $\hat{\mathcal{L}}_{\rm int}$ of equation (7.134)).
Indeed, this is just $-\hat{j}^\mu_{\rm em}\hat{A}_\mu$, so that in this type of theory the current $\hat{j}^\mu_{\rm em}$ is
not only a symmetry current, but also determines the precise way in which a
vector potential $\hat{A}^\mu$ couples to the matter field $\hat\psi$. Adding the Lagrangian for
the $\hat{A}^\mu$ field then completes the theory of a charged fermion field interacting
with the Maxwell field. In a general gauge, the $\hat{A}^\mu$ field Lagrangian is the
operator form of (7.119), $\hat{\mathcal{L}}_\xi$.

FIGURE 7.3
Possible basic ‘vertices’ associated with the interaction density $e\bar{\hat\psi}\gamma^\mu\hat\psi\hat{A}_\mu$;
these cannot occur as physical processes due to energy-momentum
constraints.

The interaction term $\hat{\mathcal{H}}'_D = q\bar{\hat\psi}\gamma^\mu\hat\psi\hat{A}_\mu$ is a ‘three-fields-at-a-point’ kind of
interaction just like our 3-scalar interaction $g\hat\phi_A\hat\phi_B\hat\phi_C$ in chapter 6. We know,
by now, exactly what all the operators in $\hat{\mathcal{H}}'_D$ are capable of: some of the
possible emission and absorption processes are shown in figure 7.3. Unlike the
‘ABC’ model with $m_C > m_A + m_B$ however, none of these elementary ‘vertex’
processes can occur as a real physical process, because all are forbidden by
the requirement of overall 4-momentum conservation. However, they will of
course contribute as virtual transitions when ‘paired up’ to form Feynman
diagrams, such as those in figure 7.4 (compare figures 6.4 and 6.5).

It is worth remarking on the fact that the ‘coupling constant’ $q$ is
dimensionless, in our units. Of course, we know this from its identification with the
electromagnetic charge in this case (see appendix C). But it is instructive to
check it as follows. A Lagrangian density has mass dimension $M^4$, since the
action is dimensionless (with $\hbar = 1$). Referring then to (7.33) we see that the
(mass) dimension of the $\hat\psi$ field is $M^{3/2}$, while (7.67) shows that that of $A^\mu$
is $M$. It follows that $\bar{\hat\psi}\gamma^\mu\hat\psi\hat{A}_\mu$ has mass dimension $M^4$, and hence $q$ must be
dimensionless.
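
The dimension counting just described amounts to solving three small linear equations, which the following sketch spells out. It assumes only what the paragraph states (a Lagrangian density of mass dimension 4 and a derivative of mass dimension 1); the variable names are illustrative.

```python
from fractions import Fraction

LAGRANGIAN_DIM = Fraction(4)   # action is dimensionless when hbar = c = 1
DERIVATIVE_DIM = Fraction(1)   # each derivative carries one power of mass

# Dirac kinetic term  psi-bar (i gamma.d) psi :  2[psi] + 1 = 4
dim_psi = (LAGRANGIAN_DIM - DERIVATIVE_DIM) / 2

# Maxwell term  F F  with  F ~ dA :  2([A] + 1) = 4
dim_A = LAGRANGIAN_DIM / 2 - DERIVATIVE_DIM

# Interaction  q psi-bar gamma psi A :  [q] + 2[psi] + [A] = 4
dim_q = LAGRANGIAN_DIM - 2 * dim_psi - dim_A

print(dim_psi, dim_A, dim_q)
```

The same bookkeeping applied to a scalar field, whose kinetic term has two derivatives, gives $[\phi] = 1$, so that a term $q^2 A^\mu A_\mu \phi^\dagger\phi$ is again consistent with a dimensionless $q$.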

The application of the Dyson formalism of chapter 6 to fermions interacting
via $\hat{\mathcal{H}}'_D$ leads directly to the Feynman rules for associating precise
mathematical formulae with diagrams such as those in figure 7.4, as usual. This will
be presented in the following chapter: see comment (3) in section 8.3.1 and
appendix L. We may simply note here that a ‘$\hat\psi$’ appears along with a ‘$\bar{\hat\psi}$’ in
$\hat{\mathcal{H}}'_D$, so that the process of ‘contraction’ (cf chapter 6) will lead to the form
$\langle 0|T(\hat\psi(x_1)\bar{\hat\psi}(x_2))|0\rangle$ of the Dirac propagator, as stated in section 7.2.

FIGURE 7.4
Lowest-order contributions to $\gamma e^- \to \gamma e^-$.

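In the same way, the global U(1) invariance (7.22) of the complex scalar
field may be generalized to a local U(1) invariance incorporating
electromagnetism. We have

$$\hat{\mathcal{L}}_{\rm KG} \to \hat{\mathcal{L}}_{\rm KG} + \hat{\mathcal{L}}_{\rm int} \qquad (7.137)$$

where

$$\hat{\mathcal{L}}_{\rm KG} = \partial_\mu\hat\phi^\dagger\partial^\mu\hat\phi - m^2\hat\phi^\dagger\hat\phi \qquad (7.138)$$

and (under $\partial_\mu \to \hat{D}_\mu$)

$$\hat{\mathcal{L}}_{\rm int} = -iq(\hat\phi^\dagger\partial^\mu\hat\phi - (\partial^\mu\hat\phi^\dagger)\hat\phi)\hat{A}_\mu + q^2\hat{A}^\mu\hat{A}_\mu\hat\phi^\dagger\hat\phi \qquad (7.139)$$

which is the field theory analogue of the interaction in (3.100). The
electromagnetic current is

$$\hat{j}^\mu_{\rm em} = -\partial\hat{\mathcal{L}}_{\rm int}/\partial\hat{A}_\mu \qquad (7.140)$$

as before, which from (7.139) is

$$\hat{j}^\mu_{\rm em} = iq(\hat\phi^\dagger\partial^\mu\hat\phi - (\partial^\mu\hat\phi^\dagger)\hat\phi) - 2q^2\hat{A}^\mu\hat\phi^\dagger\hat\phi. \qquad (7.141)$$

We note that for the boson case the electromagnetic current is not just $q$
times the (number) current $\hat{N}^\mu_\phi$ appropriate to the global phase invariance.
This has its origin in the fact that the boson current involves a derivative,
and so the gauge invariant boson current must develop a term involving $\hat{A}^\mu$
itself, as is evident in (7.141), and as we also saw in the wavefunction case
(cf equation (2.40)). The full scalar QED Lagrangian is completed by the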
inclusion of $\hat{\mathcal{L}}_\xi$ as before.

The application of the formalism of chapter 6 is not completely
straightforward in this scalar case. The problem is that $\hat{\mathcal{L}}_{\rm int}$ of (7.139) involves
derivatives of the fields and, in particular, their time derivatives. Hence the
canonical momenta will be changed from their non-interacting forms. This, in turn,
implies that the additional (interaction) term in the Hamiltonian is not just
$-\hat{\mathcal{L}}_{\rm int}$ as in the Dirac case, but is given by (problem 7.15)

$$\hat{\mathcal{H}}'_S = -\hat{\mathcal{L}}_{\rm int} - q^2(\hat{A}^0)^2\hat\phi^\dagger\hat\phi. \qquad (7.142)$$

The problem here is that the Hamiltonian and $-\hat{\mathcal{L}}_{\rm int}$ differ by a term which is
non-covariant (only $\hat{A}^0$ appears). This seems to threaten the whole approach
of chapter 6. Fortunately, another subtlety rescues the situation. There is
a second source of non-covariance arising from the time-ordering of terms
involving time derivatives, which will occur when (7.142) is used in the Dyson
series (6.42). In particular, one can show (problem 7.16) that

$$\langle 0|T(\partial_{1\mu}\hat\phi(x_1)\partial_{2\nu}\hat\phi^\dagger(x_2))|0\rangle = \partial_{1\mu}\partial_{2\nu}\langle 0|T(\hat\phi(x_1)\hat\phi^\dagger(x_2))|0\rangle - i g_{\mu 0}g_{\nu 0}\delta^4(x_1 - x_2) \qquad (7.143)$$

which also exhibits a non-covariant piece. A careful analysis (Itzykson and
Zuber 1980, section 6.1.4) shows that the two non-covariant effects exactly
compensate, so that in the Dyson series we may use $\hat{\mathcal{H}}'_S = -\hat{\mathcal{L}}_{\rm int}$ after all. The
Feynman rules for charged scalar electrodynamics are given in appendix L.




7.5 P, C and T in quantum field theory


We end this chapter by completing the discussion of the discrete symmetries 
which we began in section 4.2, extending it from the single particle (wave- 
function) theory to quantum fields. We begin with the parity transformation. 


7.5.1 Parity 


The algebraic manipulations of section 4.2.1 apply equally well to the
equations of motion for the quantum field, and we can take over the results by
replacing a transformed wavefunction such as $\psi_P(\boldsymbol{x}, t)$ by the corresponding
transformed field $\hat\psi_P(\boldsymbol{x}, t) = \hat{P}\hat\psi(\boldsymbol{x}, t)\hat{P}^{-1}$ where $\hat{P}$ is a unitary quantum field
operator (which we shall not need to calculate explicitly). Thus we have

$$\hat\phi_P(\boldsymbol{x}, t) = \hat\phi(-\boldsymbol{x}, t) \qquad (7.144)$$
$$\hat\psi_P(\boldsymbol{x}, t) = \beta\hat\psi(-\boldsymbol{x}, t) \qquad (7.145)$$

for the KG and Dirac fields, and

$$\hat{\boldsymbol{A}}_P(\boldsymbol{x}, t) = -\hat{\boldsymbol{A}}(-\boldsymbol{x}, t), \qquad \hat{A}^0_P(\boldsymbol{x}, t) = \hat{A}^0(-\boldsymbol{x}, t) \qquad (7.146)$$

for the electromagnetic fields. In (7.144)-(7.146) a simple choice of phase
factor has been made.

There is however one new feature in the quantum field case, which is that
the commutation or anticommutation relations must be left unchanged by
the transformation, if it is to be an invariance of the theory. Evidently for P
the only non-trivial case is the Dirac field, and it is easy to check that the
anticommutation relations (7.44) and (7.45) are invariant under (7.145).

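Let us see the effect of P on the free particle expansion (7.35). Equation
(7.145) becomes

$$\hat\psi_P(\boldsymbol{x}, t) = \int \frac{d^3k}{(2\pi)^3\sqrt{2\omega}} \sum_{s=1,2}\left[\hat{P}\hat{c}_s(k)\hat{P}^{-1}u(k, s)e^{-i\omega t + i\boldsymbol{k}\cdot\boldsymbol{x}} + \hat{P}\hat{d}^\dagger_s(k)\hat{P}^{-1}v(k, s)e^{i\omega t - i\boldsymbol{k}\cdot\boldsymbol{x}}\right]$$
$$= \int \frac{d^3k}{(2\pi)^3\sqrt{2\omega}} \sum_{s=1,2}\left[\hat{c}_s(k)\beta u(k, s)e^{-i\omega t - i\boldsymbol{k}\cdot\boldsymbol{x}} + \hat{d}^\dagger_s(k)\beta v(k, s)e^{i\omega t + i\boldsymbol{k}\cdot\boldsymbol{x}}\right]. \qquad (7.147)$$

Changing $\boldsymbol{k}$ to $-\boldsymbol{k}$ in the second integral and using the spinor properties

$$\beta u((\omega, -\boldsymbol{k}), s) = u(k, s), \qquad \beta v((\omega, -\boldsymbol{k}), s) = -v(k, s) \qquad (7.148)$$

in the right hand side of (7.147), we obtain the conditions

$$\hat{P}\hat{c}_s(k)\hat{P}^{-1} = \hat{c}_s(\omega, -\boldsymbol{k}), \qquad \hat{P}\hat{d}^\dagger_s(k)\hat{P}^{-1} = -\hat{d}^\dagger_s(\omega, -\boldsymbol{k}) \qquad (7.149)$$

with similar ones for $\hat{c}^\dagger_s$ and $\hat{d}_s$. Since $\hat{c}^\dagger_s$ creates a fermion from the vacuum and
$\hat{d}^\dagger_s$ creates its antiparticle, it follows that a fermion and its antiparticle have
opposite intrinsic parities. Similarly, equation (7.146) shows, when applied
to the expansion (7.104), that a physical (transverse) photon has negative
intrinsic parity.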
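
The spinor properties (7.148), on which the opposite intrinsic parities rest, can be checked explicitly. The sketch below assumes the standard Dirac representation, in which $\beta = {\rm diag}(1, 1, -1, -1)$ and the spinors take the usual two-component block form; the normalization, the test momentum and the helper names are assumptions made for illustration only.

```python
import math

# Pauli matrices; the Dirac representation is an assumption of this sketch.
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def spinors(k, m):
    """Return ([u(k,1), u(k,2)], [v(k,1), v(k,2)]) in the Dirac representation."""
    E = math.sqrt(m * m + sum(x * x for x in k))
    sk = [[sum(SIGMA[i][r][c] * k[i] for i in range(3)) for c in range(2)]
          for r in range(2)]
    norm = math.sqrt(E + m)
    us, vs = [], []
    for two in ([1, 0], [0, 1]):
        small = [sum(sk[r][c] * two[c] for c in range(2)) / (E + m) for r in range(2)]
        us.append([norm * x for x in two + small])   # large components on top
        vs.append([norm * x for x in small + two])   # large components below
    return us, vs


def apply_beta(w):
    """beta = diag(1, 1, -1, -1) acting on a 4-component spinor."""
    return [w[0], w[1], -w[2], -w[3]]


m = 1.3
k = [0.4, -0.7, 1.1]
minus_k = [-x for x in k]

us_k, vs_k = spinors(k, m)
us_mk, vs_mk = spinors(minus_k, m)

# beta u(-k, s) - u(k, s)  and  beta v(-k, s) + v(k, s) should both vanish
err_u = max(abs(a - b) for s in range(2) for a, b in zip(apply_beta(us_mk[s]), us_k[s]))
err_v = max(abs(a + b) for s in range(2) for a, b in zip(apply_beta(vs_mk[s]), vs_k[s]))
print(err_u, err_v)
```

The relative minus sign arises because $\beta$ flips the lower two components, which are the small components of $u$ but the large components of $v$.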

Turning now to the electromagnetic interaction, it is clear that
$\hat{j}^\mu_{\rm em}(x) = q\bar{\hat\psi}(x)\gamma^\mu\hat\psi(x)$ has exactly the same transformation properties under P as
$\bar\psi\gamma^\mu\psi(x)$ had, namely $\hat{j}^0_{\rm em}(x)$ is a scalar and $\hat{\boldsymbol{j}}_{\rm em}(x)$ is a polar vector. Since
this is also the way $\hat{A}^\mu$ transforms, according to (7.146), it follows that the
interaction $-\hat{j}^\mu_{\rm em}\hat{A}_\mu$ is parity invariant, as we expect for QED. The scalar
interaction (7.139) is also parity invariant.


7.5.2 Charge conjugation 


The discussion of C proceeds similarly, the transformation being represented
by a unitary quantum field operator $\hat{C}$ such that

$$\hat{C}\hat\phi\hat{C}^{-1} = \hat\phi^\dagger \qquad (7.150)$$
$$\hat{C}\hat\psi\hat{C}^{-1} = i\gamma^2\hat\psi^{\dagger T} \qquad (7.151)$$
$$\hat{C}\hat{A}^\mu\hat{C}^{-1} = -\hat{A}^\mu \qquad (7.152)$$

in the three cases of interest. Note that in terms of the decomposition (7.15)
of the complex field $\hat\phi$ into the two real fields $\hat\phi_1$ and $\hat\phi_2$, (7.150) reads

$$\hat{C}(\hat\phi_1 - i\hat\phi_2)\hat{C}^{-1} = \hat\phi_1 + i\hat\phi_2. \qquad (7.153)$$

The reader may check (problem 7.17(a)) that the Dirac field anticommutation
relations are invariant under (7.151).
Applying (7.150) to the free field expansion (7.16), we easily find

$$\hat{C}\hat{a}(k)\hat{C}^{-1} = \hat{b}(k), \qquad \hat{C}\hat{b}^\dagger(k)\hat{C}^{-1} = \hat{a}^\dagger(k), \qquad (7.154)$$

so that particle and antiparticle operators are interchanged. The conditions
(7.154) are of course consistent with (7.153). It follows that the normally
ordered $\hat{H}$ of (7.21) is even under C, while the normally ordered number
density (7.24) is odd, the ordering being with Bose commutation relations.
Carrying out the same steps for the Dirac field, and using the spinor relations
(4.95) and (4.96), we obtain

$$\hat{C}\hat{c}_s(k)\hat{C}^{-1} = \hat{d}_s(k), \qquad \hat{C}\hat{d}^\dagger_s(k)\hat{C}^{-1} = \hat{c}^\dagger_s(k); \qquad (7.155)$$

particle and antiparticle operators are again interchanged. We particularly
note that the Dirac Hamiltonian (7.55) is even under C, while the Dirac
number operator (7.54) is odd, in both cases after normal ordering with
anticommutation relations (Fermi statistics). The reader may check (problem
7.17(b)) that the electromagnetic current density $q\bar{\hat\psi}(x)\gamma^\mu\hat\psi(x)$ is odd under
C, when normally ordered, and so the interaction $-\hat{j}^\mu_{\rm em}\hat{A}_\mu$ is C-invariant. The
same is true for the KG case, after normal ordering using Bose statistics.

In section 4.2.2 we introduced self-conjugate (Majorana) spinors. In
extending that discussion to quantum field theory, it is again convenient to use
the alternative representation (3.40) for the Dirac matrices, since we can then
read off the Lorentz transformation properties from the results of section 4.1.2.
Consider the 4-component Majorana field

$$\hat\psi_M(x) = \begin{pmatrix} -i\sigma_2\hat\chi^{\dagger T}(x) \\ \hat\chi(x) \end{pmatrix}. \qquad (7.156)$$

It is easy to check from (4.19) and (4.42) that the quantity $\sigma_2\chi^*(x)$ transforms
like a $\phi$-type spinor, and so the construction (7.156) is consistent with Lorentz
covariance. The C-conjugate field is

$$\hat\psi_{M,C}(x) = \begin{pmatrix} 0 & -i\sigma_2 \\ i\sigma_2 & 0 \end{pmatrix}\begin{pmatrix} -i\sigma_2\hat\chi(x) \\ \hat\chi^{\dagger T}(x) \end{pmatrix} = \hat\psi_M(x), \qquad (7.157)$$

showing that it is self-conjugate. It is clear that the Majorana field has only
two independent degrees of freedom (those in $\hat\chi(x)$), in contrast to the Dirac
field which has four (we could of course have equally well constructed a
Majorana field using a $\phi$-type spinor field instead of a $\chi$-type one). The latter
corresponds physically to fermion and antifermion, spin up and down, but
the Majorana fermion is the same as its antiparticle. The free field expansion
corresponding to (7.35) for a Majorana field is


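$$\hat{\psi}_{\rm M}(x) = \int \frac{d^3k}{(2\pi)^3\sqrt{2\omega}} \sum_\lambda \left[\hat{c}_\lambda(k)u(k,\lambda)e^{-ik\cdot x} + \hat{c}_\lambda^\dagger(k)v(k,\lambda)e^{ik\cdot x}\right]. \tag{7.158}$$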


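The Lagrangian for a free Majorana field may be taken to be $\frac{1}{2}\bar{\hat{\psi}}_{\rm M}(i\slashed{\partial} - m)\hat{\psi}_{\rm M}$, which the reader can rewrite in terms of $\hat{\chi}$. For example, the mass
term is


$$-\tfrac{1}{2}m\bar{\hat{\psi}}_{\rm M}\hat{\psi}_{\rm M} = -\tfrac{1}{2}m\hat{\chi}^{T}i\sigma_2\hat{\chi} + \text{Hermitian conjugate}. \tag{7.159}$$


We note that this expression will vanish unless the components $\hat{\chi}_1$ and $\hat{\chi}_2$
anticommute with each other.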
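
The remark about anticommutation can be made explicit by writing out the two-component product appearing in (7.159). The following short worked step is a sketch that uses only the explicit form of $i\sigma_2$ and the component labels $\hat{\chi}_1$, $\hat{\chi}_2$ used above:

```latex
\hat{\chi}^{T}\, i\sigma_2\, \hat{\chi}
= \begin{pmatrix} \hat{\chi}_1 & \hat{\chi}_2 \end{pmatrix}
  \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}
  \begin{pmatrix} \hat{\chi}_1 \\ \hat{\chi}_2 \end{pmatrix}
= \hat{\chi}_1\hat{\chi}_2 - \hat{\chi}_2\hat{\chi}_1 .
```

If the components commuted, the right-hand side would be zero; if they anticommute, it equals $2\hat{\chi}_1\hat{\chi}_2$.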


7.5.3 Time reversal 


In section 4.2.4 we found that the time reversal transformation for the single 
particle theories was not represented by a unitary operator, but rather by the 
product of a unitary operator and the complex conjugation operator. We can 
see that the same must be true in quantum field theory by considering the 
equation of motion (6.18) for a scalar field (for simplicity), in the interaction 
picture: 


$$\frac{\partial}{\partial t}\hat{\phi}(\mathbf{x},t) = i[\hat{H}_0, \hat{\phi}(\mathbf{x},t)]. \tag{7.160}$$


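Suppose the field $\hat{\phi}_T$ in the time-reversed frame were related to $\hat{\phi}$ by a unitary
quantum field operator $\hat{U}_T$ so that (suppressing the spatial argument)
$\hat{U}_T\hat{\phi}(t)\hat{U}_T^\dagger = \hat{\phi}_T(t')$, with $t' = -t$. Then applying $\hat{U}_T \ldots \hat{U}_T^\dagger$ to equation (7.160) we would
obtain

$$\frac{\partial\hat{\phi}_T(t')}{\partial t} = i[\hat{U}_T\hat{H}_0\hat{U}_T^\dagger, \hat{\phi}_T(t')] \tag{7.161}$$

or equivalently

$$\frac{\partial\hat{\phi}_T(t')}{\partial t'} = -i[\hat{U}_T\hat{H}_0\hat{U}_T^\dagger, \hat{\phi}_T(t')]. \tag{7.162}$$


To restore (7.162) to the form (7.160), i.e. for covariance to hold, would
require that $\hat{U}_T$ transforms $\hat{H}_0$ to $-\hat{H}_0$. But this is unacceptable on physical
grounds, because the eigenvalues of $\hat{H}_0$ must be positive relative to the vacuum,
both before and after the transformation. We must therefore write the
transformation as

$$\hat{T} = \hat{U}_T K \tag{7.163}$$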


where, as in section 4.2.4, K takes the complex conjugate of ordinary numbers
and functions (i.e. it replaces i by −i). The operator $\hat{U}_T$ depends on the field
involved, but we shall not need to exhibit it explicitly.
We must now decide how the fields transform under $\hat{T}$. We can be guided
by our work in section 4.2.4 in the single particle theory, remembering that a
wavefunction is the vacuum to one particle matrix element of the corresponding
quantum field operator (see Comment (5) in section 5.2.5), and also that
matrix elements of operators and their time-reversed transforms are related
by (4.126). In the case of the KG field, for example, let us take in (4.126)
the bra to be $\langle 0|$, the operator to be $\hat{\phi}(x)$, and the ket to be $|a; p\rangle$, the state of one ‘a’ particle
with 4-momentum p. Then (4.126) gives


$$\phi(x) = \langle 0|\hat{\phi}(x)|a; E, \mathbf{p}\rangle = \langle 0|\hat{T}\hat{\phi}(x)\hat{T}^{-1}|a; E, -\mathbf{p}\rangle^*, \tag{7.164}$$


where $\phi(x)$ is the free particle solution $\exp(-iEt + i\mathbf{p}\cdot\mathbf{x})/(2E)^{1/2}$. Now in
section 4.2.4 we found the result $\phi_T(\mathbf{x},t) = \phi^*(\mathbf{x},-t)$ for the time-reversed
solution. This will be consistent with (7.164) if we take, in the quantum field
case,

$$\hat{T}\hat{\phi}(\mathbf{x},t)\hat{T}^{-1} = \hat{\phi}(\mathbf{x},-t), \tag{7.165}$$


assuming that the vacuum is invariant. Applying (7.165) to the free field 
expansion (4.5) gives 


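$$\hat{T}\hat{\phi}(\mathbf{x},t)\hat{T}^{-1} = \int\frac{d^3k}{(2\pi)^3\sqrt{2\omega}}\left[\hat{U}_T\hat{a}(k)\hat{U}_T^\dagger e^{i\omega t - i\mathbf{k}\cdot\mathbf{x}} + \hat{U}_T\hat{b}^\dagger(k)\hat{U}_T^\dagger e^{-i\omega t + i\mathbf{k}\cdot\mathbf{x}}\right] \tag{7.166}$$

$$= \hat{\phi}(\mathbf{x},-t) = \int\frac{d^3k}{(2\pi)^3\sqrt{2\omega}}\left[\hat{a}(k)e^{i\omega t + i\mathbf{k}\cdot\mathbf{x}} + \hat{b}^\dagger(k)e^{-i\omega t - i\mathbf{k}\cdot\mathbf{x}}\right]. \tag{7.167}$$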


Note that the plane wave functions have been complex conjugated in (7.166), 
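because $\hat{T}$ contains K. Changing $\mathbf{k}$ to $-\mathbf{k}$ in the integral in (7.167), we obtain
the conditions


$$\hat{U}_T\hat{a}(\omega,\mathbf{k})\hat{U}_T^\dagger = \hat{a}(\omega,-\mathbf{k}), \qquad \hat{U}_T\hat{b}^\dagger(\omega,\mathbf{k})\hat{U}_T^\dagger = \hat{b}^\dagger(\omega,-\mathbf{k}). \tag{7.168}$$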


The transformation preserves particle and antiparticle, and reverses the 3- 
momentum in the creation and annihilation operators. 
For the Dirac theory, we take, similarly, 


$$\hat{T}\hat{\psi}(\mathbf{x},t)\hat{T}^{-1} = i\alpha_1\alpha_3\hat{\psi}(\mathbf{x},-t) \tag{7.169}$$


as suggested by (4.118). The reader may check that the anticommutation 
relations are left invariant by (7.169). Applying (7.169) to the free field ex- 
pansion (7.35), and taking the spinors to be helicity eigenstates as in section 
4.2.5, we obtain the conditions 


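$$\hat{U}_T\hat{c}_\lambda(\omega,\mathbf{k})\hat{U}_T^\dagger = \hat{c}_\lambda(\omega,-\mathbf{k}), \qquad \hat{U}_T\hat{d}_\lambda^\dagger(\omega,\mathbf{k})\hat{U}_T^\dagger = \hat{d}_\lambda^\dagger(\omega,-\mathbf{k}). \tag{7.170}$$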


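Once again, the 3-momentum has been reversed in the creation and annihilation
operators.
Let us check the behaviour of the current density $\hat{j}^\mu_{\rm em}(x) = q\bar{\hat{\psi}}(x)\gamma^\mu\hat{\psi}(x)$
under the transformation (7.169). Recalling that in the standard representation
$i\alpha_1\alpha_3 = \Sigma_2$, we find


$$\hat{T}\hat{j}^0_{\rm em}(\mathbf{x},t)\hat{T}^{-1} = \hat{j}^0_{\rm em}(\mathbf{x},-t)$$

$$\hat{T}\hat{\mathbf{j}}_{\rm em}(\mathbf{x},t)\hat{T}^{-1} = q\hat{\psi}^\dagger(\mathbf{x},-t)\Sigma_2\boldsymbol{\alpha}^*\Sigma_2\hat{\psi}(\mathbf{x},-t) = -\hat{\mathbf{j}}_{\rm em}(\mathbf{x},-t). \tag{7.171}$$


This is exactly how $A^\mu(x)$, and hence $\hat{A}^\mu(x)$, transforms, and hence the electromagnetic
interaction $\hat{j}^\mu_{\rm em}\hat{A}_\mu$ is T-invariant. The same is true in the KG
case.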
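
The two matrix identities used in (7.171) can be checked numerically. The sketch below assumes the standard representation, in which each $\alpha_i$ has $\sigma_i$ in its off-diagonal blocks and $\Sigma_2$ has $\sigma_2$ in its diagonal blocks; the helper names (`block`, `time_reversed_alpha`, and so on) are purely illustrative and not part of the text.

```python
# Pauli matrices as nested lists (complex entries where needed).
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
ZERO2 = [[0, 0], [0, 0]]


def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from four 2x2 blocks."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]


def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def conjugate(x):
    return [[complex(v).conjugate() for v in row] for row in x]


def scale(c, x):
    return [[c * v for v in row] for row in x]


def close(x, y, tol=1e-12):
    return all(abs(a - b) < tol for rx, ry in zip(x, y) for a, b in zip(rx, ry))


# Assumed standard representation: alpha_i is off-diagonal in sigma_i.
ALPHA = [block(ZERO2, s, s, ZERO2) for s in SIGMA]
SIGMA2_4 = block(SIGMA[1], ZERO2, ZERO2, SIGMA[1])  # the 4x4 matrix Sigma_2


def time_reversed_alpha(i):
    """Sigma_2 alpha_i^* Sigma_2, the combination appearing in (7.171)."""
    return matmul(matmul(SIGMA2_4, conjugate(ALPHA[i])), SIGMA2_4)


if __name__ == "__main__":
    # i alpha_1 alpha_3 should coincide with Sigma_2.
    print(close(scale(1j, matmul(ALPHA[0], ALPHA[2])), SIGMA2_4))
    # Each spatial matrix should change sign, giving the odd 3-current.
    print(all(close(time_reversed_alpha(i), scale(-1, ALPHA[i])) for i in range(3)))
```

The sign change of all three $\alpha_i$ is what makes the spatial current odd, while the charge density, which involves no $\alpha$ matrix, is even.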

We may now proceed to look at some simple processes in scalar and spinor 
electrodynamics, in the following two chapters. 




Problems 


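7.1 Verify that the Lagrangian $\hat{\mathcal{L}}$ of (7.1) is invariant (i.e. $\hat{\mathcal{L}}(\hat{\phi}_1, \hat{\phi}_2) = \hat{\mathcal{L}}(\hat{\phi}_1', \hat{\phi}_2')$)
under the transformation (7.2) of the fields $(\hat{\phi}_1, \hat{\phi}_2) \to (\hat{\phi}_1', \hat{\phi}_2')$.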


7.2 


(a) Verify that, for $\hat{N}^\mu_\phi$ given by (7.23), the corresponding $\hat{N}_\phi$ of (7.14)
reduces to the form (7.24); and that, with $\hat{H}$ given by (7.21),


$$[\hat{N}_\phi, \hat{H}] = 0.$$


(b) Verify equation (7.27). 
7.3 Show that 


$$[\hat{\phi}(x_1), \hat{\phi}^\dagger(x_2)] = 0 \quad \text{for} \quad (x_1 - x_2)^2 < 0$$


[Hint: insert expression (7.16) for the $\hat{\phi}$’s and use the commutation relations
(7.18) to express the commutator as the difference of two integrals; in
the second integral, $x_1 - x_2$ can be transformed to $-(x_1 - x_2)$ by a Lorentz
transformation: the time-ordering of space-like separated events is
frame-dependent!]


7.4 Verify that varying $\psi^\dagger$ in the action principle with Lagrangian (7.34) gives
the Dirac equation.


7.5 Verify (7.44). 
7.6 Verify equations (7.52) and (7.53). 
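7.7 Verify (7.62).
7.8 Verify the expression given in (7.64) for $\sum_s u(k,s)\bar{u}(k,s)$. [Hint: first,
note that $u$ is a four-component Dirac spinor arranged as a column, while $\bar{u}$
is another four-component spinor but this time arranged as a row because of
the transpose in the † symbol. So ‘$u\bar{u}$’ has the form


$$\begin{pmatrix} u_1 \\ u_2 \\ u_3 \\ u_4 \end{pmatrix}\begin{pmatrix} \bar{u}_1 & \bar{u}_2 & \bar{u}_3 & \bar{u}_4 \end{pmatrix} = \begin{pmatrix} u_1\bar{u}_1 & u_1\bar{u}_2 & \cdots \\ u_2\bar{u}_1 & u_2\bar{u}_2 & \cdots \\ \vdots & \vdots & \end{pmatrix}.$$


Verify that

$$\phi^1\phi^{1\dagger} + \phi^2\phi^{2\dagger} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.]$$


Similarly, verify the expression for $\sum_s v(k,s)\bar{v}(k,s)$.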
7.9 Verify the result quoted in (7.63) for the Feynman propagator for the 
Dirac field. 


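7.10 Verify that if $\mathcal{L} = -\frac{1}{4}F_{\mu\nu}F^{\mu\nu} - j^\mu_{\rm em}A_\mu$, where $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$, the
Euler-Lagrange equations for $A_\mu$ yield the Maxwell form


$$\Box A^\mu - \partial^\mu(\partial_\nu A^\nu) = j^\mu_{\rm em}.$$


[Hint: it is helpful to use antisymmetry of $F_{\mu\nu}$ to rewrite the ‘F · F’ term as
$-\frac{1}{2}F_{\mu\nu}\partial^\mu A^\nu$.]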


7.11


(a) Show that the Fourier transform of the free-field equation for $A^\mu$
(i.e. the one in the previous question with $j^\mu_{\rm em}$ set to zero) is given
by (7.90).


(b) Verify (7.94). 


7.12 Show that the equation of motion for $A^\mu$, following from the Lagrangian
of (7.97), is


$$\Box A^\mu = 0.$$


7.13 Verify equation (7.118). 
7.14 Verify equations (7.120), (7.121) and (7.122). 


7.15 Verify the form (7.142) of the interaction Hamiltonian, $\hat{H}'_S$, in charged
spin-0 electrodynamics.
7.16 Verify equation (7.143).
7.17 


(a) Check that the anticommutation relations (7.44) and (7.45) are left 
invariant under (7.151). 


(b) Check that the Dirac electromagnetic current density $q\bar{\hat{\psi}}(x)\gamma^\mu\hat{\psi}(x)$ is
odd under C when normally ordered. [Hint: the normally ordered
current can be written as $\frac{1}{2}q[\bar{\hat{\psi}}(x), \gamma^\mu\hat{\psi}(x)]$.]
Plate I


Distributions of x times the unpolarized parton distribution functions f(x)
(where $f = u_v, d_v, \bar{u}, \bar{d}, s, c, b, g$) and their associated uncertainties using the
MSTW2008 parametrization (Martin et al. 2009) at a scale μ² = 10 GeV²
and μ² = 10,000 GeV². [Figure reproduced courtesy Michael Barnett, for the
Particle Data Group, from the review of Structure Functions by B F Foster,
A D Martin and M G Vincter, section 16 in the Review of Particle Physics,
K Nakamura et al. (Particle Data Group) Journal of Physics G 37 (2010)
075021, IOP Publishing Limited.] (See figure 9.9 on page 283.)

Plate II

The cross section σ for the annihilation process e⁺e⁻ → hadrons, and the
ratio R (see equation (9.100)), as a function of cm energy. [Figure reproduced
courtesy Michael Barnett, for the Particle Data Group, from the Review of
Particle Physics, K Nakamura et al. (Particle Data Group) Journal of Physics
G 37 (2010) 075021, IOP Publishing Limited.] (See figure 9.16 on page 290.)

8


Elementary Processes in Scalar and Spinor 
Electrodynamics 


8.1 Coulomb scattering of charged spin-0 particles 


We begin our study of electromagnetic interactions by considering the simplest
case, that of the scattering of a (hypothetical) positively charged spin-0
particle ‘s⁺’ by a fixed Coulomb potential, treated as a classical field. This
will lead us to the relativistic generalization of the Rutherford formula for
the cross section. We shall use this example as an exercise to gain familiarity
with the quantum field-theoretic approach of chapter 6, since it can also be
done straightforwardly using the ‘wavefunction’ approach familiar from
non-relativistic quantum mechanics, when supplemented by the work of chapter 3.
We shall also look at ‘s⁻’ Coulomb scattering, to test the antiparticle
prescriptions of chapter 3. Incidentally, we call these scalar particles s± to emphasize
that they are not to be identified with, for instance, the physical pions π±,
since the latter are composite ($q\bar{q}$) systems, and hence their interactions are
more complicated than those of our hypothetical ‘point-like’ s± (as we shall
see in section 8.4). No point-like charged scalar particles have been discovered,
as yet.


8.1.1 Coulomb scattering of s⁺ (wavefunction approach)


Consider the scattering of a spin-0 particle of charge e and mass M, the ‘s⁺’, in
an electromagnetic field described by the classical potential $A^\mu$. The process
we are considering is

$$s^+(p) \to s^+(p') \tag{8.1}$$


as shown in figure 8.1, where p and p′ are the initial and final 4-momenta
respectively. The appropriate potential for use in the KG equation has been
given in section 3.5:


$$\hat{V}_{\rm KG} = ie(\partial_\mu A^\mu + A^\mu\partial_\mu) - e^2A^2. \tag{8.2}$$


As we shall see in more detail as we go along, the parameter characterizing
each order of perturbation theory based on this potential is found to be $e^2/4\pi$.

FIGURE 8.1
Coulomb scattering of s⁺.


In natural units (see appendices B and C) this has the value


$$\alpha = e^2/4\pi \approx \frac{1}{137} \tag{8.3}$$

for the elementary charge e. α is called the fine structure constant. The smallness
of α is the reason why a perturbation approach has been very successful
for QED.

To lowest order in α we can neglect the $e^2A^2$ term and the perturbing
potential is then

$$\hat{V} = ie(\partial_\mu A^\mu + A^\mu\partial_\mu). \tag{8.4}$$


For a scattering process we shall assume¹ the same formula for the transition
amplitude as in non-relativistic quantum mechanics (NRQM) time-dependent
perturbation theory (see appendix A, equations (A.23) and (A.24)):


$$\mathcal{A}_{s^+} = -i\int d^4x\, \phi'^*\hat{V}\phi \tag{8.5}$$


where $\phi$ and $\phi'$ are the initial and final state free-particle solutions. The latter
are (recall equation (3.11))


$$\phi = Ne^{-ip\cdot x} \tag{8.6}$$

$$\phi' = N'e^{-ip'\cdot x} \tag{8.7}$$


and we shall fix the normalization factors later. Inserting the expression for
$\hat{V}$ into (8.5), and doing some integration by parts (problem 8.1), we obtain


$$\mathcal{A}_{s^+} = -i\int d^4x\, \{ie[\phi'^*(\partial_\mu\phi) - (\partial_\mu\phi'^*)\phi]\}A^\mu. \tag{8.8}$$


The expression inside the braces is very reminiscent of the probability current
expression (3.20). Indeed we can write (8.8) as


$$\mathcal{A}_{s^+} = -i\int d^4x\, j^\mu_{{\rm em},s^+}(x)A_\mu(x) \tag{8.9}$$


¹ Justification may be found in chapter 9 of Bjorken and Drell (1964).

where

$$j^\mu_{{\rm em},s^+}(x) = ie(\phi'^*\partial^\mu\phi - (\partial^\mu\phi'^*)\phi) \tag{8.10}$$


can be regarded as an electromagnetic ‘transition current’, analogous to the
simple probability current for a single state. In the following section we shall
see the exact meaning of this idea, using quantum field theory. Meanwhile,
we insert the plane-wave free-particle solutions (8.6) and (8.7) for $\phi$ and $\phi'$
into (8.10) to obtain


$$j^\mu_{{\rm em},s^+}(x) = NN'e(p+p')^\mu e^{-i(p-p')\cdot x} \tag{8.11}$$


so that (8.9) becomes

$$\mathcal{A}_{s^+} = -iNN'\int d^4x\, e(p+p')_\mu e^{-i(p-p')\cdot x}A^\mu(x). \tag{8.12}$$


In the case of Coulomb scattering from a static point charge Ze (e > 0),
the vector potential $A^\mu$ is given by

$$A^0 = \frac{Ze}{4\pi|\mathbf{x}|}, \qquad \mathbf{A} = 0. \tag{8.13}$$

Inserting (8.13) into (8.12) we obtain


$$\mathcal{A}_{s^+} = -iNN'Ze^2(E+E')\int e^{-i(E-E')t}\,dt\int\frac{e^{i(\mathbf{p}-\mathbf{p}')\cdot\mathbf{x}}}{4\pi|\mathbf{x}|}\,d^3x. \tag{8.14}$$


The initial and final 4-momenta are

$$p = (E, \mathbf{p}), \qquad p' = (E', \mathbf{p}')$$


with $E = \sqrt{M^2 + \mathbf{p}^2}$, $E' = \sqrt{M^2 + \mathbf{p}'^2}$. The first (time) integral in (8.14)
gives an energy-conserving δ-function $2\pi\delta(E' - E)$ (see appendix E), as is
expected for a static (non-recoiling) scattering centre. The second (spatial)
integral is the Fourier transform of $1/4\pi|\mathbf{x}|$, which can be obtained from (1.13),
(1.26) and (1.27) by setting $m_U = 0$; the result is $1/\mathbf{q}^2$ where $\mathbf{q} = \mathbf{p} - \mathbf{p}'$. Hence


$$\mathcal{A}_{s^+} = -iNN'\,2\pi\delta(E'-E)\frac{Ze^2}{\mathbf{q}^2}\,2E \tag{8.15}$$

$$\equiv -i(2\pi)\delta(E'-E)V_{s^+} \quad \text{(cf equation (A.25))} \tag{8.16}$$


where in (8.15) we have used E = E′ in the matrix element. This is in the
standard form met in time-dependent perturbation theory (cf equations (A.25)
and (A.26)).
The transition probability per unit time is then (appendix H, equation
(H.18))

$$\dot{P}_{s^+} = 2\pi|V_{s^+}|^2\rho(E') \tag{8.17}$$

where $\rho(E')$ is the density of final states per energy interval dE′. This will
depend on the normalization adopted for $\phi$, $\phi'$ via the factors N, N′. We
choose these to be unity, which means that we are adopting the ‘covariant’
normalization of 2E particles per unit volume. Then (cf equation (H.22))


$$\rho(E')\,dE' = \frac{|\mathbf{p}'|^2\,d|\mathbf{p}'|}{(2\pi)^3\,2E'}\,d\Omega. \tag{8.18}$$

Using $E' = (M^2 + \mathbf{p}'^2)^{1/2}$ one easily finds

$$\rho(E') = \frac{|\mathbf{p}'|\,d\Omega}{16\pi^3}. \tag{8.19}$$


Note that this differs from equation (H.22) since here we are using relativistic
kinematics.

To obtain the cross section, we need to divide $\dot{P}_{s^+}$ by the incident flux,
which is $2|\mathbf{p}|$ in our normalization. Hence


$$d\sigma = (4Z^2e^4E^2/16\pi^2\mathbf{q}^4)\,d\Omega. \tag{8.20}$$


Finally, since $\mathbf{q}^2 = (\mathbf{p} - \mathbf{p}')^2 = 4|\mathbf{p}|^2\sin^2\theta/2$ (cf section 1.3.4) where θ is the
angle between $\mathbf{p}$ and $\mathbf{p}'$, we obtain


$$\frac{d\sigma}{d\Omega} = (Z\alpha)^2\frac{E^2}{4|\mathbf{p}|^4\sin^4\theta/2}. \tag{8.21}$$

This is the Rutherford formula with relativistic kinematics, showing the characteristic
$\sin^{-4}\theta/2$ angular dependence (cf figure 1.8). This deservedly famous
formula will serve as a ‘reference point’ for all the subsequent calculations in
this chapter, as we proceed to add in various complications, such as spin, recoil
and structure. The non-relativistic form may be retrieved by replacing E
by M.
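
As a check on the bookkeeping between (8.15) and (8.21), the chain ‘matrix element, density of states, flux’ can be evaluated numerically and compared with the closed form. The sketch below works in natural units with N = N′ = 1, and takes α = 1/137 as quoted in (8.3); the function names and the sample values of Z, M, |p| and θ are illustrative assumptions only.

```python
import math

ALPHA = 1.0 / 137.0  # fine structure constant, the value quoted in (8.3)


def rutherford_from_golden_rule(Z, M, p, theta):
    """d(sigma)/d(Omega) assembled step by step from (8.15)-(8.20)."""
    E = math.sqrt(M * M + p * p)
    e_squared = 4.0 * math.pi * ALPHA            # from alpha = e^2 / 4 pi
    q_squared = 4.0 * p * p * math.sin(theta / 2.0) ** 2
    V = Z * e_squared * 2.0 * E / q_squared      # V_{s+} of (8.16), N = N' = 1
    rho_per_solid_angle = p / (16.0 * math.pi ** 3)   # (8.19), with |p'| = |p|
    rate = 2.0 * math.pi * V * V * rho_per_solid_angle  # (8.17)
    flux = 2.0 * p                               # incident flux in this normalization
    return rate / flux


def rutherford_closed_form(Z, M, p, theta):
    """The relativistic Rutherford formula (8.21)."""
    E = math.sqrt(M * M + p * p)
    return (Z * ALPHA) ** 2 * E * E / (4.0 * p ** 4 * math.sin(theta / 2.0) ** 4)


if __name__ == "__main__":
    # Illustrative values in natural units: unit mass, |p| = 0.5, theta = 1 rad.
    print(rutherford_from_golden_rule(1, 1.0, 0.5, 1.0))
    print(rutherford_closed_form(1, 1.0, 0.5, 1.0))
```

The two functions agree to rounding error, which confirms that the factors of 2E, 2π and 16π³ in (8.15) to (8.20) combine as stated.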


8.1.2 Coulomb scattering of s⁺ (field-theoretic approach)


We follow steps closely similar to those in section 6.3.1, making use of the
result quoted in section 7.4, that the appropriate interaction Hamiltonian for
use in the Dyson series (6.42) is $\hat{H}'_S = -\hat{\mathcal{L}}_{\rm int}$ where $\hat{\mathcal{L}}_{\rm int}$ is given by (7.139),
with q = e. As in the step from (8.2) to (8.4) we discard the e² term to first
order and use


$$\hat{H}'_S(x) = ie(\hat{\phi}^\dagger(x)\partial^\mu\hat{\phi}(x) - (\partial^\mu\hat{\phi}^\dagger(x))\hat{\phi}(x))A_\mu(x). \tag{8.22}$$


Equation (8.22) can be written as $\hat{j}^\mu_{{\rm em},s}A_\mu$ where


$$\hat{j}^\mu_{{\rm em},s} = ie(\hat{\phi}^\dagger\partial^\mu\hat{\phi} - (\partial^\mu\hat{\phi}^\dagger)\hat{\phi}). \tag{8.23}$$

Note that the field $A_\mu$ is not quantized: it is being treated as an ‘external’
classical potential. The expansion for the field $\hat{\phi}$ is given in (7.16). As in
(6.48), the lowest-order amplitude is


$$\mathcal{A}_{s^+} = -i\langle s^+, p'|\int d^4x\, \hat{H}'_S(x)|s^+, p\rangle \tag{8.24}$$

where (cf (6.49))


$$|s^+, p\rangle = \sqrt{2E}\,\hat{a}^\dagger(p)|0\rangle. \tag{8.25}$$


We are, of course, anticipating in our notation that (8.24) will indeed be the
same as (8.12). The required amplitude is then


$$\mathcal{A}_{s^+} = -i\int d^4x\, \langle s^+, p'|\hat{j}^\mu_{{\rm em},s}(x)|s^+, p\rangle A_\mu(x). \tag{8.26}$$


Using the expansion (7.16), the definition (8.25) and the vacuum conditions
(7.30), and following the method of section 6.3.1, it is a good exercise to check
that the value of the matrix element in (8.26) is (problem 8.2)


$$\langle s^+, p'|\hat{j}^\mu_{{\rm em},s}(x)|s^+, p\rangle = e(p+p')^\mu e^{-i(p-p')\cdot x}. \tag{8.27}$$


This is exactly the same as the expression we obtained in (8.11) for the wave
mechanical transition current in this case, using the normalization N = N′ =
1, which is consistent with the field-theoretic normalization in (8.25). Thus
our wave mechanical transition current is indeed the matrix element of the
field-theoretical electromagnetic current operator:


$$j^\mu_{{\rm em},s^+}(x) = \langle s^+, p'|\hat{j}^\mu_{{\rm em},s}(x)|s^+, p\rangle. \tag{8.28}$$


Combining all these results, we have therefore connected the ‘wavefunction’
amplitude and the ‘field-theory’ amplitude via


$$\mathcal{A}_{s^+} = -i\int d^4x\, j^\mu_{{\rm em},s^+}(x)A_\mu(x) = -i\int d^4x\, \langle s^+, p'|\hat{j}^\mu_{{\rm em},s}(x)|s^+, p\rangle A_\mu(x). \tag{8.29}$$

We note that because of the static nature of the potential, and the non-covariant
choice of $A^\mu$ (only $A^0 \neq 0$), our answer in either case cannot be
expected to yield a Lorentz invariant amplitude.


8.1.3 Coulomb scattering of s⁻
The physical process is (figure 8.2(a))


$$s^-(p) \to s^-(p') \tag{8.30}$$


FIGURE 8.2

Coulomb scattering of s⁻: (a) the physical process with antiparticles of positive
4-momentum, and (b) the related unphysical process with particles of
negative 4-momentum, using the Feynman prescription.


where, of course, E and E′ are both positive ($E = (M^2 + \mathbf{p}^2)^{1/2}$ and similarly
for E′). Since the charge on the antiparticle s⁻ is −e, the amplitude for this
process can, in fact, be immediately obtained from (8.12) by merely changing
the sign of e. Because of the way e and the 4-momenta p and p′ enter (8.12),
however, this in turn is the same as letting p → −p′ and p′ → −p: this
changes the sign of the ‘$e(p+p')_\mu$’ part as required, and leaves the exponential
unchanged. Hence we see in action here (admittedly in a very simple example)
the Feynman interpretation of the negative 4-momentum solutions, described
in section 3.4.4: the amplitude for s⁻(p) → s⁻(p′) is the same as the amplitude
for s⁺(−p′) → s⁺(−p). The latter process is shown in figure 8.2(b).
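
The substitution rule can be illustrated numerically with the plane-wave transition current of (8.11), taking N = N′ = 1. In the sketch below the function names, the metric helper and the sample momenta are illustrative assumptions; the only physics input is the form $e(p+p')^\mu e^{-i(p-p')\cdot x}$.

```python
import cmath
import math


def minkowski_dot(a, b):
    """a.b with metric signature (+, -, -, -)."""
    return a[0] * b[0] - sum(x * y for x, y in zip(a[1:], b[1:]))


def on_shell(mass, three_momentum):
    """Build the 4-momentum (E, p) with E = sqrt(M^2 + p^2) > 0."""
    energy = math.sqrt(mass ** 2 + sum(c * c for c in three_momentum))
    return [energy, *three_momentum]


def negate(p):
    return [-c for c in p]


def transition_current(charge, p, p_prime, x):
    """charge * (p + p')^mu * exp(-i (p - p').x), as in (8.11) with N = N' = 1."""
    diff = [a - b for a, b in zip(p, p_prime)]
    phase = cmath.exp(-1j * minkowski_dot(diff, x))
    return [charge * (a + b) * phase for a, b in zip(p, p_prime)]


if __name__ == "__main__":
    e = math.sqrt(4.0 * math.pi / 137.0)
    p = on_shell(1.0, [0.0, 0.0, 0.8])
    p_prime = on_shell(1.0, [0.3, 0.1, 0.7])
    x = [0.4, 1.0, -2.0, 0.5]  # an arbitrary space-time point
    # Antiparticle: flip the sign of the charge, keep the physical momenta.
    print(transition_current(-e, p, p_prime, x))
    # Feynman prescription: particle charge, with p -> -p' and p' -> -p.
    print(transition_current(+e, negate(p_prime), negate(p), x))
```

Both calls give the same four components, because the replacement reverses the sign of (p + p′) while leaving (p − p′) unchanged.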


The same conclusion can be derived from the field-theory formalism. In
this case we need to evaluate the matrix element


$$\langle s^-, p'|\hat{j}^\mu_{{\rm em},s}(x)|s^-, p\rangle \tag{8.31}$$


where the same $\hat{j}^\mu_{{\rm em},s}$ of equation (8.23) enters: $\hat{\phi}$ of (7.16) contains the
antiparticle operator too! It is again a good exercise to check, using


$$|s^-, p\rangle = \sqrt{2E}\,\hat{b}^\dagger(p)|0\rangle \tag{8.32}$$


and remembering to normally order the operators in $\hat{j}^\mu_{{\rm em},s}$, that (8.31) is given
by the expected result, namely, (8.27) with e → −e (problem 8.3).


Since the matrix elements only differ by a sign, the cross sections for s⁺
and s⁻ Coulomb scattering will be the same to this (lowest) order in α.

FIGURE 8.3
Coulomb scattering of e⁻.


8.2 Coulomb scattering of charged spin-½ particles


8.2.1 Coulomb scattering of e⁻ (wavefunction approach)


We shall call the particle an electron, of charge −e (e > 0) and mass m; note
that by convention it is the negatively charged fermion that is the ‘particle’,
but the positively charged boson. The process we are considering is (figure 8.3)


$$e^-(k,s) \to e^-(k',s') \tag{8.33}$$


where k, s are the 4-momentum and spin of the incident e⁻, and similarly for
k′, s′, with $k = (E, \mathbf{k})$ and $E = (m^2 + \mathbf{k}^2)^{1/2}$ and similarly for k′.
The appropriate potential to use in the Dirac equation has been given in
section 3.5:

$$\hat{V}_{\rm D} = -eA^0\mathbf{1} + e\boldsymbol{\alpha}\cdot\mathbf{A} = -e\begin{pmatrix} A^0 & -\boldsymbol{\sigma}\cdot\mathbf{A} \\ -\boldsymbol{\sigma}\cdot\mathbf{A} & A^0 \end{pmatrix} \tag{8.34}$$


for a particle of charge −e. This potential is a 4 × 4 matrix and to obtain an
amplitude in the form of a single complex number, we must use $\psi^\dagger$ instead of
$\psi^*$ in the matrix element. The first-order amplitude (figure 8.3) is therefore


$$\mathcal{A}_{e^-} = -i\int d^4x\, \psi'^\dagger(x)\hat{V}_{\rm D}\psi(x) \tag{8.35}$$


where s and s′ label the spin components of $\psi$ and $\psi'$. The spin labels are necessary
since the spin configuration may be changed by the interaction. In (8.35),
$\psi$ and $\psi'$ are free-particle positive-energy solutions of the Dirac equation,
as in (3.74), with u given by equation (3.73) and normalized to $u^\dagger u = 2E$,
$E = (m^2 + \mathbf{k}^2)^{1/2}$.

The Lorentz properties of (8.35) become much clearer if we use the γ-matrix
notation of problem 4.3. For convenience we re-state the definitions
here:


$$\gamma^0 = \beta, \qquad (\gamma^0)^2 = 1 \tag{8.36}$$

$$\gamma^i = \beta\alpha_i, \qquad (\gamma^i)^2 = -1, \qquad i = 1, 2, 3. \tag{8.37}$$

The Dirac equation may then be written (problem 4.3) as

$$(i\slashed{\partial} - m)\psi = 0 \tag{8.38}$$


where the ‘slash’ notation introduced in (7.59) has been used ($\slashed{\partial} = \gamma^\mu\partial_\mu$).
Defining $\bar{\psi} = \psi^\dagger\gamma^0$, (8.35) becomes


$$\mathcal{A}_{e^-} = -i\int d^4x\, (-e\bar{\psi}'(x)\gamma^\mu\psi(x))A_\mu(x) \tag{8.39}$$

$$\equiv -i\int d^4x\, j^\mu_{{\rm em},e^-}(x)A_\mu(x) \tag{8.40}$$


where we have defined an electromagnetic transition current for a negatively
charged fermion:


$$j^\mu_{{\rm em},e^-}(x) = -e\bar{\psi}'(x)\gamma^\mu\psi(x), \tag{8.41}$$


exactly analogous to the one for a positively charged boson introduced in
section 8.1.1. We know from section 4.1.2 that $\bar{\psi}'\gamma^\mu\psi$ is a 4-vector, showing
that $\mathcal{A}_{e^-}$ of (8.40) is Lorentz invariant.


Inserting free-particle solutions for $\psi$ and $\psi'$ in (8.41), we obtain


$$j^\mu_{{\rm em},e^-}(x) = -e\bar{u}(k',s')\gamma^\mu u(k,s)e^{-i(k-k')\cdot x} \tag{8.42}$$


so that (8.39) becomes

$$\mathcal{A}_{e^-} = -i\int d^4x\, (-e\bar{u}'\gamma^\mu u\, e^{-i(k-k')\cdot x})A_\mu(x) \tag{8.43}$$


where u = u(k, s) and similarly for u′. Note that the u’s do not depend on x.
For the case of the Coulomb potential in equation (8.13), $\mathcal{A}_{e^-}$ becomes


$$\mathcal{A}_{e^-} = i2\pi\delta(E'-E)\frac{Ze^2}{\mathbf{q}^2}u'^\dagger u \tag{8.44}$$


just as in (8.15), where $\mathbf{q} = \mathbf{k} - \mathbf{k}'$ and we have used $\bar{u}'\gamma^0 = u'^\dagger$. Comparing
(8.44) with (8.15), we see that (using the covariant normalization N = N′ = 1)
the amplitude in the spinor case is obtained from that for the scalar case by
the replacement $2E \to u'^\dagger u$ and the sign of the amplitude is reversed as
expected for e⁻ rather than s⁺ scattering.

We now have to understand how to define the cross section for particles
with spin and then how to calculate it. Clearly the cross section is proportional
to $|\mathcal{A}_{e^-}|^2$, which involves $|u^\dagger(k',s')u(k,s)|^2$ here. Usually the incident beam
is unpolarized, which means that it is a random mixture of both spin states
s (‘up’ or ‘down’). It is important to note that this is an incoherent average,
in the sense that we average the cross section rather than the amplitude.
Furthermore, most experiments usually measure only the direction and energy
of the scattered electron and are not sensitive to the spin state s′. Thus what
we wish to calculate, in this case, is the unpolarized cross section defined by

$$d\bar{\sigma} = \tfrac{1}{2}(d\sigma_{\uparrow\uparrow} + d\sigma_{\uparrow\downarrow} + d\sigma_{\downarrow\uparrow} + d\sigma_{\downarrow\downarrow}) = \tfrac{1}{2}\sum_{s',s}d\sigma_{s',s} \tag{8.45}$$


where $d\sigma_{s',s} \propto |u^\dagger(k',s')u(k,s)|^2$. In (8.45), we are averaging over the two
possible initial spin polarizations and summing over the final spin states arising
from each initial spin state.

It is possible to calculate the quantity


$$S = \tfrac{1}{2}\sum_{s',s}|u'^\dagger u|^2 \tag{8.46}$$


by brute force, using (3.73) and taking the two-component spinors to be, say,


$$\phi^1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad \phi^2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}. \tag{8.47}$$


One finds (problem 8.4)

$$S = (2E)^2(1 - v^2\sin^2\theta/2) \tag{8.48}$$


where $v = |\mathbf{k}|/E$ is the particle’s speed and θ is the scattering angle. If we
now recall that (i) the matrix element (8.44) can be obtained from (8.15) by
the replacement ‘$2E \to u'^\dagger u$’ and (ii) the normalization of our spinor states
is the same (‘ρ = 2E’) as in the scalar case, so that the flux and density of
states factors are unchanged, we may infer from (8.21) that


$$\frac{d\bar{\sigma}}{d\Omega} = (Z\alpha)^2\frac{E^2}{4|\mathbf{k}|^4}\frac{(1 - v^2\sin^2\theta/2)}{\sin^4\theta/2}. \tag{8.49}$$


This is the Mott cross section (Mott 1929). Comparing this with the basic
Rutherford formula (8.21), we see that the factor $(1 - v^2\sin^2\theta/2)$ (which comes
from the spin summation) represents the effect of replacing spin-0 scattering
particles by spin-½ ones.
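
The brute-force evaluation of S can be mimicked numerically. The sketch below assumes the standard-representation spinor $u = \sqrt{E+m}\,(\phi,\ \boldsymbol{\sigma}\cdot\mathbf{k}\,\phi/(E+m))$, normalized to $u^\dagger u = 2E$ (our reading of (3.73), which is not reproduced in this section), together with the basis spinors of (8.47); all function names and sample values are illustrative.

```python
import math


def sigma_dot(k):
    """The 2x2 matrix sigma.k for a 3-vector k."""
    kx, ky, kz = k
    return [[kz, kx - 1j * ky], [kx + 1j * ky, -kz]]


def spinor_u(k, m, phi):
    """Assumed form u = sqrt(E+m) (phi, sigma.k phi / (E+m)), so u^dagger u = 2E."""
    E = math.sqrt(m * m + sum(c * c for c in k))
    sk = sigma_dot(k)
    lower = [sum(sk[i][j] * phi[j] for j in range(2)) / (E + m) for i in range(2)]
    norm = math.sqrt(E + m)
    return [norm * c for c in list(phi) + lower]


def spin_sum_S(k, k_prime, m):
    """S = (1/2) sum over s', s of |u'^dagger u|^2, as in (8.46)."""
    basis = [(1, 0), (0, 1)]
    total = 0.0
    for phi in basis:
        u = spinor_u(k, m, phi)
        for phi_prime in basis:
            u_prime = spinor_u(k_prime, m, phi_prime)
            amplitude = sum(complex(a).conjugate() * b for a, b in zip(u_prime, u))
            total += abs(amplitude) ** 2
    return 0.5 * total


def closed_form_S(m, k_mag, theta):
    """The right-hand side of (8.48)."""
    E = math.sqrt(m * m + k_mag * k_mag)
    v = k_mag / E
    return (2.0 * E) ** 2 * (1.0 - v * v * math.sin(theta / 2.0) ** 2)


if __name__ == "__main__":
    m, k_mag, theta = 0.511, 1.3, 0.9  # illustrative values, natural units
    k = [0.0, 0.0, k_mag]
    k_prime = [k_mag * math.sin(theta), 0.0, k_mag * math.cos(theta)]
    print(spin_sum_S(k, k_prime, m), closed_form_S(m, k_mag, theta))
```

For equal initial and final |k| the two numbers agree, and at θ = π the sum reduces to 4m², so it vanishes in the massless limit.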

Indeed, this factor has an important physical interpretation. Consider the
extreme relativistic limit (v → 1, m → 0), when the factor becomes $\cos^2\theta/2$,
which vanishes in the backward direction θ = π. This may be understood as
follows. In the m → 0 limit, it is appropriate to use the representation (3.40)
of the Dirac matrices and, in this case equations (4.14) and (4.15) show that
the Dirac spinor takes the form

$$u = \begin{pmatrix} u_R \\ u_L \end{pmatrix} \tag{8.50}$$

where $u_R$ and $u_L$ have positive and negative helicity respectively. The spinor
part of the matrix element (8.44) then becomes $u_R'^\dagger u_R + u_L'^\dagger u_L$, from which it is
clear that helicity is conserved: the helicity of the u′ spinors equals that of the
u spinors; in particular there are no helicity mixing terms of the form $u_R'^\dagger u_L$ or
$u_L'^\dagger u_R$. Consider then an initial state electron with positive helicity, and take
the z-axis to be along the incident momentum. The z-component of angular
momentum is then +½. Suppose the electron is scattered through an angle
of π. Since helicity is conserved, the scattered electron’s helicity will still be
positive, but since the direction of its momentum has been reversed, its angular
momentum along the original axis will be −½. Hence this configuration is
forbidden by angular momentum conservation, and similarly for an incoming
negative helicity state. The spin labels s′, s in (8.46) can be taken to be
helicity labels and so it follows that the quantity S must vanish for θ = π in
the m → 0 limit. The ‘R’ and ‘L’ states are mixed by a mass term in the Dirac
equation (see (4.14) and (4.15)) and hence we expect backward scattering to
be increasingly allowed as m/E increases (recall that $v = (1 - m^2/E^2)^{1/2}$ so
that $1 - v^2\sin^2\theta/2 = \cos^2\theta/2 + (m^2/E^2)\sin^2\theta/2$).


8.2.2 Coulomb scattering of e⁻ (field-theoretic approach)


Once again, the interaction Hamiltonian has been given in section 7.4, namely

$$\hat{H}'_D = -e\bar{\hat{\psi}}\gamma^\mu\hat{\psi}A_\mu = \hat{j}^\mu_{{\rm em},e}A_\mu \tag{8.51}$$


where the current operator $\hat{j}^\mu_{{\rm em},e}$ is just $-e\bar{\hat{\psi}}\gamma^\mu\hat{\psi}$ in this case. The lowest-order
amplitude is then


$$\mathcal{A}_{e^-} = -i\langle e^-, k', s'|\int d^4x\, \hat{H}'_D(x)|e^-, k, s\rangle \tag{8.52}$$

$$= -i\int d^4x\, \langle e^-, k', s'|\hat{j}^\mu_{{\rm em},e}(x)|e^-, k, s\rangle A_\mu(x). \tag{8.53}$$


With our normalization, and referring to the fermionic expansion (7.35), the
states are defined by

$$|e^-, k, s\rangle = \sqrt{2E}\,\hat{c}_s^\dagger(k)|0\rangle \tag{8.54}$$

and similarly for the final state. We then find (problem 8.5) that the current
matrix element in (8.53) takes the form


$$\langle e^-, k', s'|\hat{j}^\mu_{{\rm em},e}(x)|e^-, k, s\rangle = -e\bar{u}'\gamma^\mu u\, e^{-i(k-k')\cdot x} = j^\mu_{{\rm em},e^-}(x) \tag{8.55}$$


exactly as in (8.42). Thus once again, the ‘wavefunction’ and ‘field-theoretic’
approaches have been shown to be equivalent, in a simple case.


8.2.3 Trace techniques for spin summations 


The calculation of cross sections involving fermions rapidly becomes laborious
following the ‘brute force’ method of section 8.2.1, in which the explicit forms
for u and u′ were used. Fortunately we can avoid this by using a powerful
labour-saving device due to Feynman, in which the γ’s come into their own.
We need to calculate the quantity S given in (8.46). This will turn out to
be just the first in a series of such objects. With later needs in mind, we shall
here calculate a more general quantity than (8.46), namely the lepton tensor


$$L^{\mu\nu}(k',k) = \tfrac{1}{2}\sum_{s',s}\bar{u}(k',s')\gamma^\mu u(k,s)[\bar{u}(k',s')\gamma^\nu u(k,s)]^* \tag{8.56}$$

$$= \frac{1}{2e^2}\sum_{s',s}\langle e^-, k', s'|\hat{j}^\mu_{{\rm em},e}(0)|e^-, k, s\rangle\langle e^-, k', s'|\hat{j}^\nu_{{\rm em},e}(0)|e^-, k, s\rangle^*. \tag{8.57}$$


Clearly this will be relevant to the more general case in which $A^\mu$ contains
non-zero spatial components, for example. For our present application, we
shall need only $L^{00}$.

We first note that $L^{\mu\nu}$ is correctly called a tensor (a contravariant second-rank
one, in fact; see appendix D), because the two factors ‘$\bar{u}\gamma^\mu u$’ and ‘$\bar{u}\gamma^\nu u$’ are
each 4-vectors, as we have seen. (We might worry a little over the complex
conjugation of the second factor, but this will disappear after the next step.)
Consider therefore the factor $[\bar{u}(k',s')\gamma^\nu u(k,s)]^*$. For each value of the index
ν, this is just a number (the corresponding component of the 4-vector), and
so it can make no difference if we take its transpose, in a matrix sense (the
transpose of a 1 × 1 matrix is certainly equal to itself!). In that case the
complex conjugate becomes the Hermitian conjugate, which is:


$$[\bar{u}(k',s')\gamma^\nu u(k,s)]^\dagger = u^\dagger(k,s)\gamma^{\nu\dagger}\gamma^{0\dagger}u(k',s') \tag{8.58}$$

$$= \bar{u}(k,s)\gamma^\nu u(k',s') \tag{8.59}$$

since (problem 8.6)

$$\gamma^{\nu\dagger}\gamma^0 = \gamma^0\gamma^\nu \tag{8.60}$$


and $\gamma^0 = \gamma^{0\dagger}$. Thus $L^{\mu\nu}$ may be written in the more streamlined form


$$L^{\mu\nu} = \tfrac{1}{2}\sum_{s',s}\bar{u}(k',s')\gamma^\mu u(k,s)\bar{u}(k,s)\gamma^\nu u(k',s') \tag{8.61}$$


which is, moreover, evidently the (tensor) product of two 4-vectors. However,
there is more to this than saving a few symbols. We have seen the expression


$$\sum_s u(k,s)\bar{u}(k,s) \tag{8.62}$$


before! (See (7.64) and problem 7.8.) Thus we can replace the sum (8.62)
over spin states ‘s’ by the corresponding matrix $(\slashed{k} + m)$:


$$L^{\mu\nu} = \tfrac{1}{2}\sum_{s'}\bar{u}_\alpha(k',s')(\gamma^\mu)_{\alpha\beta}(\slashed{k} + m)_{\beta\gamma}(\gamma^\nu)_{\gamma\delta}u_\delta(k',s') \tag{8.63}$$

where we have made the matrix indices explicit, and summation on all repeated
matrix indices is understood. In particular, note that every matrix index is
repeated, so that each one is in fact summed over: there are no ‘spare’ indices.
Now, since we can reorder matrix elements as we wish, we can bring the $u_\delta$
to the front of the expression, and use the same trick to perform the second
spin sum:


$$\sum_{s'}u_\delta(k',s')\bar{u}_\alpha(k',s') = (\slashed{k}' + m)_{\delta\alpha}. \tag{8.64}$$


Thus $L^{\mu\nu}$ takes the form of a matrix product, summed over the diagonal
elements:


$$L^{\mu\nu} = \tfrac{1}{2}(\slashed{k}' + m)_{\delta\alpha}(\gamma^\mu)_{\alpha\beta}(\slashed{k} + m)_{\beta\gamma}(\gamma^\nu)_{\gamma\delta} \tag{8.65}$$

$$= \tfrac{1}{2}\sum_\delta[(\slashed{k}' + m)\gamma^\mu(\slashed{k} + m)\gamma^\nu]_{\delta\delta} \tag{8.66}$$


where we have explicitly reinstated the sum over δ. The right-hand side of
(8.66) is the trace (i.e. the sum of the diagonal elements) of the matrix formed
by the product of the four indicated matrices:


$$L^{\mu\nu} = \tfrac{1}{2}{\rm Tr}[(\slashed{k}' + m)\gamma^\mu(\slashed{k} + m)\gamma^\nu]. \tag{8.67}$$


Such matrix traces have some useful properties which we now list. Denote
the trace of a matrix A by


$${\rm Tr}A = \sum_i A_{ii}. \tag{8.68}$$


Consider now the trace of a matrix product,

$${\rm Tr}(AB) = \sum_{i,j}A_{ij}B_{ji} \tag{8.69}$$


where we have written the summations in explicitly. We can (as before) freely
exchange the order of the matrix elements $A_{ij}$ and $B_{ji}$, to rewrite (8.69) as


$${\rm Tr}(AB) = \sum_{i,j}B_{ji}A_{ij}. \tag{8.70}$$


But the right-hand side is precisely Tr(BA); hence we have shown that

$${\rm Tr}(AB) = {\rm Tr}(BA). \tag{8.71}$$

Similarly it is easy to show that

$${\rm Tr}(ABC) = {\rm Tr}(CAB). \tag{8.72}$$
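
The cyclic property is easy to test on arbitrary matrices. The following sketch uses randomly generated 4 × 4 complex matrices; the helper names and the fixed random seed are illustrative choices, not anything prescribed by the text.

```python
import random


def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(x):
    """Sum of the diagonal elements, as in (8.68)."""
    return sum(x[i][i] for i in range(len(x)))


def random_complex_matrix(n, rng):
    return [[complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]
            for _ in range(n)]


rng = random.Random(0)  # fixed seed so that the run is reproducible
A, B, C = (random_complex_matrix(4, rng) for _ in range(3))

if __name__ == "__main__":
    # (8.71): the two traces agree although AB and BA are different matrices.
    print(trace(matmul(A, B)), trace(matmul(B, A)))
    # (8.72): a cyclic permutation leaves the trace unchanged.
    print(trace(matmul(matmul(A, B), C)), trace(matmul(matmul(C, A), B)))
```

Only cyclic reorderings are guaranteed to preserve the trace; a non-cyclic one such as Tr(ACB) generally gives a different number.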


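We may now return to (8.67). The advantage of the trace form is that we
can invoke some powerful results about traces of products of γ-matrices. Here
we shall just list the trace ‘theorems’ that we shall use to evaluate $L^{\mu\nu}$: more
complete statements of trace theorems and γ-matrix algebra, together with
proofs of these theorems, are given in appendix J.

We need the following results:


$$\text{(i)}\quad {\rm Tr}\,\mathbf{1} = 4 \tag{8.73}$$

$$\text{(ii)}\quad {\rm Tr}\,(\text{odd number of } \gamma\text{’s}) = 0 \tag{8.74}$$

$$\text{(iii)}\quad {\rm Tr}(\slashed{a}\slashed{b}) = 4(a\cdot b) \tag{8.75}$$

$$\text{(iv)}\quad {\rm Tr}(\slashed{a}\slashed{b}\slashed{c}\slashed{d}) = 4[(a\cdot b)(c\cdot d) + (a\cdot d)(b\cdot c) - (a\cdot c)(b\cdot d)]. \tag{8.76}$$

Then

$${\rm Tr}[(\slashed{k}' + m)\gamma^\mu(\slashed{k} + m)\gamma^\nu] = {\rm Tr}(\slashed{k}'\gamma^\mu\slashed{k}\gamma^\nu) + m{\rm Tr}(\gamma^\mu\slashed{k}\gamma^\nu) + m{\rm Tr}(\slashed{k}'\gamma^\mu\gamma^\nu) + m^2{\rm Tr}(\gamma^\mu\gamma^\nu). \tag{8.77}$$

The terms linear in m are zero by theorem (ii), and using (iii) in the form

$${\rm Tr}(\gamma_\mu\gamma_\nu)a^\mu b^\nu = 4g_{\mu\nu}a^\mu b^\nu = 4a\cdot b \tag{8.78}$$

and (iv) in a similar form, we obtain (problem 8.7)


$$L^{\mu\nu} = \tfrac{1}{2}{\rm Tr}[(\slashed{k}' + m)\gamma^\mu(\slashed{k} + m)\gamma^\nu] = 2[k'^\mu k^\nu + k'^\nu k^\mu - (k'\cdot k)g^{\mu\nu}] + 2m^2g^{\mu\nu}. \tag{8.79}$$

In the present case we simply want $L^{00}$, which is found to be (problem 8.8)


$$L^{00} = 4E^2(1 - v^2\sin^2\theta/2) \tag{8.80}$$


where $v = |\mathbf{k}|/E$, just as in (8.48).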
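
The trace theorems and the result (8.79) can be verified with explicit matrices. The sketch below assumes the standard representation, $\gamma^0 = \beta = {\rm diag}(1,1,-1,-1)$ and $\gamma^i = \beta\alpha_i$, and a metric of signature (+, −, −, −); the helper names and the sample vectors are illustrative.

```python
import math
from functools import reduce

METRIC = [1, -1, -1, -1]  # diagonal of g_{mu nu}, numerically equal to g^{mu nu}
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]


def gamma_matrices():
    """gamma^0 = beta and gamma^i = beta alpha_i (assumed standard representation)."""
    g0 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
    spatial = [[[0, 0, s[0][0], s[0][1]],
                [0, 0, s[1][0], s[1][1]],
                [-s[0][0], -s[0][1], 0, 0],
                [-s[1][0], -s[1][1], 0, 0]] for s in PAULI]
    return [g0] + spatial


GAMMA = gamma_matrices()


def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def product(*matrices):
    return reduce(matmul, matrices)


def add(x, y):
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]


def scale(c, x):
    return [[c * v for v in row] for row in x]


def trace(x):
    return sum(x[i][i] for i in range(4))


def dot(a, b):
    return sum(METRIC[mu] * a[mu] * b[mu] for mu in range(4))


def slash(a):
    """a-slash = gamma^mu a_mu for a 4-vector given by its upper components."""
    return reduce(add, (scale(METRIC[mu] * a[mu], GAMMA[mu]) for mu in range(4)))


def lepton_tensor_from_trace(k_prime, k, m):
    """(1/2) Tr[(k'-slash + m) gamma^mu (k-slash + m) gamma^nu], as in (8.67)."""
    left = add(slash(k_prime), scale(m, IDENTITY))
    right = add(slash(k), scale(m, IDENTITY))
    return [[0.5 * trace(product(left, GAMMA[mu], right, GAMMA[nu]))
             for nu in range(4)] for mu in range(4)]


def lepton_tensor_closed_form(k_prime, k, m):
    """The right-hand side of (8.79)."""
    def g(mu, nu):
        return METRIC[mu] if mu == nu else 0
    return [[2 * (k_prime[mu] * k[nu] + k_prime[nu] * k[mu]
                  - dot(k_prime, k) * g(mu, nu)) + 2 * m * m * g(mu, nu)
             for nu in range(4)] for mu in range(4)]


if __name__ == "__main__":
    m, k_mag, theta = 0.511, 1.3, 0.7  # illustrative values, natural units
    E = math.sqrt(m * m + k_mag * k_mag)
    k = [E, 0.0, 0.0, k_mag]
    k_prime = [E, k_mag * math.sin(theta), 0.0, k_mag * math.cos(theta)]
    print(lepton_tensor_from_trace(k_prime, k, m)[0][0])
    print(4 * E * E * (1 - (k_mag / E) ** 2 * math.sin(theta / 2) ** 2))
```

The trace and the closed form agree for all sixteen components, and the 00 component reproduces (8.80).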


8.2.4 Coulomb scattering of e⁺


The physical process is

$$e^+(k,s) \to e^+(k',s') \tag{8.81}$$

where, as usual, we emphasize that E and E′ are both positive. In the wavefunction
approach, we saw in section 3.4.4 that, because ρ ≥ 0 always for a
Dirac particle, we had to introduce a minus sign ‘by hand’, according to the
rule stated at the end of section 3.4.4. This rule gives us, in the present case,


$$\text{amplitude}\,(e^+(k,s) \to e^+(k',s')) = -\text{amplitude}\,(e^-(-k',-s') \to e^-(-k,-s)). \tag{8.82}$$


Referring to (8.43), therefore, the required amplitude for the process (8.81) is


$$\mathcal{A}_{e^+} = -i\int d^4x\, (e\bar{v}(k,s)\gamma^\mu v(k',s')e^{-i(k-k')\cdot x})A_\mu(x) \tag{8.83}$$

since the ‘v’ solutions have been set up precisely to correspond to the ‘−k, −s’
situation. In evaluating the cross section from (8.83), the only difference from
the e⁻ case is the appearance of the spinors ‘v’ rather than ‘u’; the lepton
tensor in this case is


$$L^{\mu\nu} = \tfrac{1}{2}{\rm Tr}[(\slashed{k} - m)\gamma^\mu(\slashed{k}' - m)\gamma^\nu] \tag{8.84}$$


using the result (7.64) for $\sum_s v(k,s)\bar{v}(k,s)$. Expression (8.84) differs from
(8.67) by the sign of m and by k ↔ k′, but the result (8.79) for the trace
is insensitive to these changes. Thus the positron Coulomb scattering cross
section is equal to the electron one to lowest order in α.

In the field-theoretic approach, the same interaction Hamiltonian $\hat{H}'_{\rm D}$,
which we used for e⁻ scattering will again automatically yield the e⁺ matrix
element (recall the discussion at the end of section 8.1.3). In place of
(8.53), the amplitude we wish to calculate is

$$ A_{e^+} = -i\int d^4x\,\langle e^+,k',s'|\,\hat{j}^\mu_{{\rm em},e}(x)\,|e^+,k,s\rangle A_\mu(x) = -i\int d^4x\,\langle e^+,k',s'|\,{-e}\,\bar{\hat\psi}(x)\gamma^\mu\hat\psi(x)\,|e^+,k,s\rangle A_\mu(x) \qquad (8.85) $$
where, referring to the fermionic expansion (7.35),

$$ |e^+,k,s\rangle = \sqrt{2E}\,\hat{d}_s^\dagger(k)|0\rangle, \qquad (8.86) $$

and similarly for the final state. In evaluating the matrix element in (8.85) we
must again remember to normally order the fields, according to the discussion
in section 7.2. Bearing this in mind, and inserting the expansion (7.35), one
finds (problem 8.9)

$$ \langle e^+,k',s'|\,\hat{j}^\mu_{{\rm em},e}(x)\,|e^+,k,s\rangle = +e\,\bar{v}(k,s)\gamma^\mu v(k',s')\,e^{-i(k-k')\cdot x} \qquad (8.87) $$

just as required in (8.83). Note especially that the correct sign has emerged 
naturally without having to be put in ‘by hand’, as was necessary in the 
wavefunction approach when applied to an antifermion. 

We are now ready to look at some more realistic (and covariant) processes. 


8.3 e⁻s⁺ scattering

8.3.1 The amplitude for e⁻s⁺ → e⁻s⁺

We consider the two-body scattering process 


$$ e^-(k,s) + s^+(p) \to e^-(k',s') + s^+(p') \qquad (8.89) $$

FIGURE 8.4
e⁻s⁺ scattering amplitude.


where the 4-momenta and spins are as indicated in figure 8.4. How will the e⁻
and s⁺ interact? In this case, there is no ‘external’ classical electromagnetic
potential in the problem. Instead, each of e⁻ and s⁺, as charged particles,
act as sources for the electromagnetic field, with which they in turn interact.
We can picture the process as one in which each particle scatters off
the ‘virtual’ field produced by the other (we shall make this more precise in
comment (2) after equation (8.102)). The formalism of quantum field theory
is perfectly adapted to account for such effects, as we shall see. It is very
significant that no new interaction is needed to describe the process (8.89)
beyond what we already have: the complete Lagrangian is now simply the
free-field Lagrangians for the spin-½ e⁻, the spin-0 s⁺ and the Maxwell field,
together with the sum of the lowest order scalar electromagnetic interaction
Hamiltonian of (8.22), and the Dirac interaction Hamiltonian of (7.135) with
q = −e. The full interaction Hamiltonian is then

$$ \hat{H}'(x) = \left[\,ie\left(\hat\phi^\dagger(x)\partial^\mu\hat\phi(x) - (\partial^\mu\hat\phi^\dagger(x))\hat\phi(x)\right) - e\,\bar{\hat\psi}(x)\gamma^\mu\hat\psi(x)\right]\hat{A}_\mu(x) \qquad (8.90) $$

where the ‘total current’ in (8.91) is just the indicated sum of the $\hat\phi$ (scalar)
and $\hat\psi$ (spinor) currents. This $\hat{H}'$ must now be used in the Dyson expansion
(6.42), in a perturbative calculation of the e⁻s⁺ → e⁻s⁺ amplitude.

Note now that, in contrast to our Coulomb scattering ‘warm-ups’, the
electromagnetic field is quantized in (8.90). We first observe that, since there are
no free photons in either the initial or final states in our process e⁻s⁺ → e⁻s⁺,
the first-order matrix element of $\hat{H}'$ must vanish (as did the corresponding
first-order amplitude in AB → AB scattering, in section 6.3.2). The first
non-vanishing scattering processes arise at second order (cf (6.74)):

$$ A_{e^-s^+} = \frac{(-i)^2}{2}\iint d^4x_1\,d^4x_2\,\langle 0|\,\hat{c}_{s'}(k')\,\hat{a}(p')\,T\{\hat{H}'(x_1)\hat{H}'(x_2)\}\,\hat{a}^\dagger(p)\,\hat{c}_s^\dagger(k)\,|0\rangle \times (16E_kE_{k'}E_pE_{p'})^{1/2}. \qquad (8.92) $$


Just as for AB → AB and the C field in the ‘ABC’ model (cf (6.81)), as far
as the $\hat{A}_\mu$ operators in (8.92) are concerned the only surviving contraction is
$$ \langle 0|T(\hat{A}_\mu(x_1)\hat{A}_\nu(x_2))|0\rangle \qquad (8.93) $$


which is the Feynman propagator for the photon, in coordinate space. As
regards the rest of the matrix element (8.92), since the $\hat{a}$'s and $\hat{c}$'s commute
the ‘s⁺’ and ‘e⁻’ parts are quite independent, and (8.92) reduces to

$$ \frac{(-i)^2}{2}\iint d^4x_1\,d^4x_2\,\Big\{\langle s^+,p'|\hat{j}^\mu_{{\rm em},s}(x_1)|s^+,p\rangle\,\langle 0|T(\hat{A}_\mu(x_1)\hat{A}_\nu(x_2))|0\rangle \times \langle e^-,k',s'|\hat{j}^\nu_{{\rm em},e}(x_2)|e^-,k,s\rangle + (x_1 \leftrightarrow x_2)\Big\}. \qquad (8.94) $$

But we know the explicit form of the current matrix elements in (8.94), from
(8.27) and (8.55). Inserting these expressions into (8.94), and noting that the
term with x₁ ↔ x₂ is identical to the first term, one finds (cf (6.102) and
problem 8.10)


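$$ A_{e^-s^+} = i(2\pi)^4\delta^4(p + k - p' - k')\,\mathcal{M}_{e^-s^+} \qquad (8.95) $$
where (using the general form (7.122) of the photon propagator)

$$ i\mathcal{M}_{e^-s^+} = (-i)^2\,\big(e(p+p')^\mu\big)\left(\frac{i[-g_{\mu\nu} + (1-\xi)q_\mu q_\nu/q^2]}{q^2}\right)\big({-e}\,\bar{u}(k',s')\gamma^\nu u(k,s)\big) \qquad (8.96) $$
$$ \equiv (-i)^2\,j^\mu_{s^+}(p,p')\left(\frac{i[-g_{\mu\nu} + (1-\xi)q_\mu q_\nu/q^2]}{q^2}\right)j^\nu_{e^-}(k,k') \qquad (8.97) $$
and q = (k − k′) = (p′ − p). We have introduced here the ‘momentum-space’
currents
$$ j^\mu_{s^+}(p,p') = e(p+p')^\mu \qquad (8.98) $$
and
$$ j^\mu_{e^-}(k,k') = -e\,\bar{u}(k',s')\gamma^\mu u(k,s) \qquad (8.99) $$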


shortening the notation by dropping the ‘em’ suffix, which is understood. 
Before proceeding to calculate the cross section, some comments on (8.97) 
are in order: 


Comment (1) 


The $j^\mu_{s^+}(p,p')$ and $j^\mu_{e^-}(k,k')$ in (8.98) and (8.99) are the momentum-space
versions of the x-dependent current matrix elements in (8.27) and (8.55); they are,
in fact, simply those matrix elements evaluated at x = 0. The x-dependent
matrix elements (8.27) and (8.55) both satisfy the current conservation
equations $\partial_\mu j^\mu(x) = 0$ as is easy to check (problem 8.11). Correspondingly, it
follows from (8.98) and (8.99) that we have

$$ q_\mu j^\mu_{s^+}(p,p') = q_\mu j^\mu_{e^-}(k,k') = 0 \qquad (8.100) $$
where q = p′ − p = k − k′, and we have used the mass-shell conditions $p^2 = p'^2 = M^2$,
$\slashed{k}u = mu$, $\bar{u}'\slashed{k}' = m\bar{u}'$; the relations (8.100) are the momentum-space
versions of current conservation. The ξ-dependent part of the photon
propagator, which is proportional to $q_\mu q_\nu$, therefore vanishes in the matrix
element (8.97). This shows that the amplitude is independent of the gauge
parameter ξ; in other words, it is gauge invariant and proportional simply
to

$$ j^\mu_{s^+}\,\frac{g_{\mu\nu}}{q^2}\,j^\nu_{e^-}. \qquad (8.101) $$


Comment (2) 


The amplitude (8.97) has the appealing form of two currents ‘hooked together’
by the photon propagator. In the form (8.101), it has a simple ‘semi-classical’
interpretation. Suppose we regard the process e⁻s⁺ → e⁻s⁺ as the scattering
of the e⁻, say, in the field produced by the s⁺ (we can see from (8.101) that
the answer is going to be symmetrical with respect to whichever of e⁻ and s⁺
is singled out in this way). Then the amplitude will be, as in (8.43),

$$ A_{e^-s^+} = -i\int d^4x\; j^\mu_{e^-}(x)\,A_\mu(x) \qquad (8.102) $$

where now the classical field $A_\mu(x)$ is not an ‘external’ Coulomb field but the
field caused by the motion of the s⁺. It seems very plausible that this $A_\mu(x)$
should be given by the solution of the Maxwell equations (2.22), with the
$j^\nu_{\rm em}(x)$ on the right-hand side given by the transition current (8.11) (with
N = N′ = 1) appropriate to the motion s⁺(p) → s⁺(p′):

$$ \Box A^\nu - \partial^\nu(\partial^\mu A_\mu) = j^\nu_{s^+}(x) \qquad (8.103) $$

where
$$ j^\nu_{s^+}(x) = e(p+p')^\nu\,e^{-i(p-p')\cdot x}. \qquad (8.104) $$


Equation (8.103) will be much easier to solve if we can decouple the
components of $A^\nu$ by using the Lorentz condition $\partial_\mu A^\mu = 0$. We are aware of the
problems with this condition in the field-theory case (cf section 7.3.2) but we
are here treating $A^\nu$ classically. Although $A^\nu$ is not a free field in (8.103), it is
easy to see that we may consistently take $\partial_\mu A^\mu = 0$ provided that the current
is conserved, $\partial_\nu j^\nu_{s^+}(x) = 0$, which we know to be the case. Thus we have to
solve

$$ \Box A^\nu(x) = e(p+p')^\nu\,e^{-i(p-p')\cdot x}. \qquad (8.105) $$

Noting that

$$ \Box\,e^{-i(p-p')\cdot x} = -(p-p')^2\,e^{-i(p-p')\cdot x} \qquad (8.106) $$

we obtain, by inspection,

$$ A^\nu(x) = -\frac{1}{q^2}\,e(p+p')^\nu\,e^{-i(p-p')\cdot x} \qquad (8.107) $$

where q = p′ − p. Inserting this expression into the amplitude (8.102) we find
$$ A_{e^-s^+} = i(2\pi)^4\delta^4(p + k - p' - k')\,\mathcal{M}_{e^-s^+} \qquad (8.108) $$

where

$$ i\mathcal{M}_{e^-s^+} = j^\mu_{s^+}(p,p')\,\frac{i g_{\mu\nu}}{q^2}\,j^\nu_{e^-}(k,k') \qquad (8.109) $$
exactly as in (8.97) for ξ = 1 (the gauge appropriate to ‘$\partial_\mu A^\mu = 0$’).

FIGURE 8.5
Feynman diagram for e⁻s⁺ scattering in the one-photon exchange approximation.


Comment (3) 


From the work of chapter 6, it is clear that we can give a Feynman graph 
interpretation of the amplitude (8.109), as shown in figure 8.5, and set out 
the corresponding Feynman rules: 


(i) At a vertex where a photon is emitted or absorbed by an s⁺ particle,
the factor is $-ie(p+p')^\mu$ where p, p′ are the incident and outgoing
4-momenta of the s⁺; the vertex for s⁻ has the opposite sign.


(ii) At a vertex where a photon is emitted or absorbed by an e⁻, the
factor is $ie\gamma^\mu$ (e > 0); for an e⁺ it is $-ie\gamma^\mu$. (This and the previous
rule arise from associating one ‘(−i)’ factor in (8.94) or (8.97) with
each current.)


(iii) For each initial state fermion line a factor u(k, s) and for each
final state fermion line a factor $\bar{u}(k',s')$; for each initial state
antifermion a factor $\bar{v}(k,s)$ and for each final state antifermion line a
factor v(k′, s′) (these rules reconstruct the e⁺ Coulomb amplitudes
of section 8.2.4).


(iv) For an internal photon of 4-momentum q, there is a factor $-ig_{\mu\nu}/q^2$
in the gauge ξ = 1.

(v) Multiplying these factors together gives the quantity $i\mathcal{M}$;
multiplying the result by an overall 4-momentum-conserving δ-function
factor $(2\pi)^4\delta^4(p' + k' + \cdots - p - k - \cdots)$ gives the quantity A.


Comment (4)


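We know that our amplitude is proportional to

$$ j^\mu_{s^+}\,\frac{g_{\mu\nu}}{q^2}\,j^\nu_{e^-}. \qquad (8.110) $$
Choosing the coordinate system such that $q^\mu = (q^0, 0, 0, |\mathbf{q}|)$, the current
conservation equations $q\cdot j_{s^+} = q\cdot j_{e^-} = 0$ read:
$$ j^3 = q^0 j^0/|\mathbf{q}| \qquad (8.111) $$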


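for both currents. Expression (8.101) can then be written as
$$ \frac{j^\mu_{s^+}g_{\mu\nu}j^\nu_{e^-}}{q^2} = -\frac{j^1_{s^+}j^1_{e^-} + j^2_{s^+}j^2_{e^-}}{q^2} - \frac{j^3_{s^+}j^3_{e^-} - j^0_{s^+}j^0_{e^-}}{q^2} = -\frac{j^1_{s^+}j^1_{e^-} + j^2_{s^+}j^2_{e^-}}{q^2} - \frac{j^0_{s^+}j^0_{e^-}}{\mathbf{q}^2} \qquad (8.112) $$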


using (8.111). The first term may be interpreted as being due to the exchange
of a transversely polarized photon (only the 1, 2 components enter,
perpendicular to **q**). For real photons q² → 0, so that this term will completely
dominate the second. The latter, however, must obviously be included when
q² ≠ 0, as of course is the case for this virtual γ (cf section 6.3.3). We note
that the second term depends on the 3-momentum squared, **q**², rather than
the 4-momentum squared q², and that it involves the charge densities $j^0_{s^+}$ and
$j^0_{e^-}$. Referring back to section 7.1, we can interpret it as the instantaneous
Coulomb interaction between these charge densities, since


$$ \int d^3x\,\frac{e^{i\mathbf{q}\cdot\mathbf{x}}}{|\mathbf{x}|} = \frac{4\pi}{\mathbf{q}^2}. \qquad (8.113) $$


Thus, in summary, the single covariant amplitude (8.109) includes contribu- 
tions from the exchange of transversely polarized photons and from the fa- 
miliar Coulomb potential. This is the true relativistic extension of the static 
Coulomb results of (8.15) and (8.44). 


8.3.2 The cross section for e⁻s⁺ → e⁻s⁺

The invariant amplitude $\mathcal{M}_{e^-s^+}(s,s')$ for our process is given by (8.109) as
$$ \mathcal{M}_{e^-s^+}(s,s') = -e\,\bar{u}(k',s')\gamma^\mu u(k,s)\,\frac{g_{\mu\nu}}{q^2}\,e(p+p')^\nu \qquad (8.114) $$

where we have now included the spin dependence of the amplitude $\mathcal{M}_{e^-s^+}$ in
the notation. The steps to the cross sections are now exactly as for the spin-0
case (section 6.3.4), as modified by the spin summing and averaging already
met in sections 8.2.1 and 8.2.3, particularly the latter. The cross section for
the scattering of an electron in spin state s to one in spin state s′ is (cf (6.110))

$$ d\sigma_{ss'} = \frac{1}{4E\omega|\mathbf{v}|}\,|\mathcal{M}_{e^-s^+}(s,s')|^2\,(2\pi)^4\delta^4(k'+p'-k-p)\,\frac{1}{(2\pi)^3}\frac{d^3k'}{2\omega'}\,\frac{1}{(2\pi)^3}\frac{d^3p'}{2E'} \qquad (8.115) $$
where we have defined
$$ k^\mu = (\omega, \mathbf{k}), \quad k'^\mu = (\omega', \mathbf{k}'), \quad p^\mu = (E, \mathbf{p}), \quad p'^\mu = (E', \mathbf{p}'). \qquad (8.116) $$


For the unpolarized cross section we are required, as in (8.46), to evaluate
the quantity

$$ \tfrac{1}{2}\sum_{s,s'}|\mathcal{M}_{e^-s^+}(s,s')|^2 = \left(\frac{e^2}{q^2}\right)^2\tfrac{1}{2}\sum_{s,s'}\bar{u}(k',s')\gamma^\mu u(k,s)\,\bar{u}(k,s)\gamma^\nu u(k',s') \times (p+p')_\mu(p+p')_\nu \qquad (8.117) $$
$$ = \left(\frac{e^2}{q^2}\right)^2 L^{\mu\nu}T_{\mu\nu} \qquad (8.118) $$

where the boson tensor $T_{\mu\nu}$ is just $(p+p')_\mu(p+p')_\nu$, and the lepton tensor
$L^{\mu\nu}$ has been evaluated in (8.79). Using $q^2 = (k-k')^2 = 2m^2 - 2k\cdot k'$, the
expression (8.79) can be rewritten as

$$ L^{\mu\nu}(k,k') = 2[k'^\mu k^\nu + k'^\nu k^\mu + (q^2/2)g^{\mu\nu}]. \qquad (8.119) $$
We then find (problem 8.12)
$$ L^{\mu\nu}T_{\mu\nu} = 8[2(p\cdot k)(p\cdot k') + (q^2/2)M^2] \qquad (8.120) $$
since k′·p′ = k·p and k·p′ = k′·p from 4-momentum conservation, and
p² = p′² = M² (we are using m for the e⁻ mass and M for the s⁺ mass).
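The contraction (8.120) is easy to confirm numerically from the tensor form (8.119). The sketch below assumes CM-frame elastic kinematics and works only with 4-vector algebra; the helper names and the sample masses and momenta are illustrative, not values from the text.

```python
import math

METRIC = (1.0, -1.0, -1.0, -1.0)  # diagonal metric, signature (+,-,-,-)


def dot(a, b):
    return sum(METRIC[mu] * a[mu] * b[mu] for mu in range(4))


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def add(a, b):
    return tuple(x + y for x, y in zip(a, b))


def lepton_tensor(k_out, k_in, mu, nu):
    """L^{mu nu} in the form (8.119), with upper indices."""
    q2 = dot(sub(k_in, k_out), sub(k_in, k_out))
    g = METRIC[mu] if mu == nu else 0.0
    return 2 * (k_out[mu] * k_in[nu] + k_out[nu] * k_in[mu] + 0.5 * q2 * g)


def contracted_tensors(k_in, k_out, p_in, p_out):
    """L^{mu nu} T_{mu nu} with T_{mu nu} = (p + p')_mu (p + p')_nu, summed explicitly."""
    total = add(p_in, p_out)
    lowered = [METRIC[mu] * total[mu] for mu in range(4)]  # (p + p')_mu
    return sum(lepton_tensor(k_out, k_in, mu, nu) * lowered[mu] * lowered[nu]
               for mu in range(4) for nu in range(4))


def closed_form(k_in, k_out, p_in, target_mass):
    """Right-hand side of (8.120)."""
    q2 = dot(sub(k_in, k_out), sub(k_in, k_out))
    return 8 * (2 * dot(p_in, k_in) * dot(p_in, k_out) + 0.5 * q2 * target_mass ** 2)


def cm_kinematics(m, target_mass, p_mag, theta):
    """Elastic two-body scattering in the CM frame (electron initially along +z)."""
    omega = math.sqrt(m * m + p_mag * p_mag)
    energy = math.sqrt(target_mass ** 2 + p_mag * p_mag)
    sx, cz = p_mag * math.sin(theta), p_mag * math.cos(theta)
    k_in, p_in = (omega, 0.0, 0.0, p_mag), (energy, 0.0, 0.0, -p_mag)
    k_out, p_out = (omega, sx, 0.0, cz), (energy, -sx, 0.0, -cz)
    return k_in, k_out, p_in, p_out
```

For any on-shell, momentum-conserving configuration, `contracted_tensors` and `closed_form` should agree to rounding error.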


We can now give the differential cross section in the CM frame by taking
over the formula (6.129) with

$$ |\mathcal{M}|^2 = \tfrac{1}{2}\sum_{s,s'}|\mathcal{M}_{e^-s^+}(s,s')|^2 $$

so as to obtain

$$ \left(\frac{d\sigma}{d\Omega}\right)_{\rm CM} = \frac{2\alpha^2}{W^2(q^2)^2}\left[2(p\cdot k)(p\cdot k') + (q^2/2)M^2\right] \qquad (8.121) $$

where α = e²/4π and W² = (k + p)².

FIGURE 8.6
Two-body scattering in the ‘laboratory’ frame.


A somewhat more physically meaningful formula is found if we ask for
the cross section in the ‘laboratory’ frame which we define by the condition
$p^\mu = (M, \mathbf{0})$. The evaluation of the phase space integral requires some care
and this is detailed in appendix K. The result is

$$ \frac{d\sigma}{d\Omega} = \frac{\alpha^2}{4k^2\sin^4(\theta/2)}\,\frac{k'}{k}\,\cos^2(\theta/2). \qquad (8.122) $$

In this formula we have neglected the electron mass in the kinematics so that

$$ k \equiv |\mathbf{k}| = \omega \qquad (8.123) $$
$$ k' \equiv |\mathbf{k}'| = \omega' \qquad (8.124) $$

and
$$ q^2 = -4kk'\sin^2(\theta/2) \qquad (8.125) $$

where θ is the electron scattering angle in this frame, as shown in figure 8.6,
and
$$ (k/k') = 1 + (2k/M)\sin^2(\theta/2) \qquad (8.126) $$

from equation (K.20). Note that there is a slight abuse of notation here: in the
context of results for such laboratory frame calculations, ‘k’ and ‘k′’ are not
4-vectors, but rather the moduli of 3-vectors, as defined in equations (8.123)
and (8.124).

We shall denote the cross section (8.122) by

$$ \left(\frac{d\sigma}{d\Omega}\right)_{\rm ns} \equiv \text{‘no-structure’ cross section}. \qquad (8.127) $$

It describes essentially the ‘kinematics’ of a relativistic electron scattering
from a pointlike spin-0 target which recoils. Comparing the result (8.122)
with equation (8.49), and remembering that here Z = 1 and we are taking
v → 1 for the electron, we see that the effect of recoil is contained in the
factor (k′/k), in this limit. We recover the ‘no-recoil’ result (8.49) in the
limit M → ∞, as expected. In particular, referring to (8.125), we understand
Rutherford’s ‘sin⁻⁴ θ/2’ factor in terms of the exchange of a massless quantum,
via the propagator factor (1/q²)².
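As a rough illustration of (8.122) to (8.126), the sketch below evaluates the ‘no-structure’ cross section in natural units and isolates the recoil factor k′/k. The function names, the numerical value used for α and the sample inputs are assumptions made for this example only.

```python
import math

ALPHA = 1.0 / 137.035999  # fine-structure constant (approximate value, assumed here)


def scattered_momentum(k, theta, target_mass):
    """k' from (8.126): k/k' = 1 + (2k/M) sin^2(theta/2), electron mass neglected."""
    return k / (1.0 + (2.0 * k / target_mass) * math.sin(theta / 2.0) ** 2)


def momentum_transfer_squared(k, theta, target_mass):
    """q^2 from (8.125); always negative (space-like) for theta != 0."""
    k_out = scattered_momentum(k, theta, target_mass)
    return -4.0 * k * k_out * math.sin(theta / 2.0) ** 2


def no_recoil_cross_section(k, theta):
    """The M -> infinity limit of (8.122), i.e. (8.122) with k'/k set to 1."""
    s2 = math.sin(theta / 2.0) ** 2
    return ALPHA ** 2 * (1.0 - s2) / (4.0 * k * k * s2 * s2)


def no_structure_cross_section(k, theta, target_mass):
    """(d sigma / d Omega)_ns of (8.122), in units of 1/energy^2 (hbar = c = 1)."""
    return no_recoil_cross_section(k, theta) * scattered_momentum(k, theta, target_mass) / k
```

For example, with k and the target mass in GeV the result is in GeV⁻²; letting the target mass grow very large drives the ratio of the two cross sections to 1, which is the no-recoil limit described above.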
FIGURE 8.7
e⁻π⁺ scattering amplitude.


This ‘no-structure’ cross section also occurs in the cross section for the
scattering of electrons by protons or muons: the appellation ‘no-structure’
will be made clearer in the discussion of form factors which follows. As in
the case of e± Coulomb scattering, the cross sections for e⁻s⁺ and for e⁺s⁺
scattering are identical at this (lowest) order of perturbation theory.


8.4 Scattering from a non-point-like object: the pion
form factor in e⁻π⁺ → e⁻π⁺


As remarked earlier, we have been careful not to call the ‘s⁺’ particle a π⁺,
because the latter is a composite system which cannot be expected to have
point-like interactions with the electromagnetic field, as has been assumed
for the s⁺; rather, in the case of the π⁺ it is the quark constituents which
interact locally with the electromagnetic field. The quarks also, of course,
interact strongly with each other via the interactions of QCD, and since these
are strong they cannot (in this case) be treated perturbatively. Indeed, a
full understanding of the electromagnetically probed ‘structure’ of hadrons
has not yet been achieved. Instead, we must describe the e⁻ scattering from
physical π⁺’s in terms of a phenomenological quantity, the pion form factor,
which encapsulates in a relativistically invariant manner the ‘non-point-like’
aspect of the hadronic state π⁺.


The physical process is
$$ e^-(k,s) + \pi^+(p) \to e^-(k',s') + \pi^+(p') \qquad (8.128) $$

which we represent, in general, by figure 8.7. To lowest order in α, the
amplitude is represented diagrammatically by a generalization of figure 8.5, shown
in figure 8.8, in which the point-like ssγ vertex is replaced by the ππγ ‘blob’,
which signifies all the unknown strong interaction corrections.

FIGURE 8.8
One-photon exchange amplitude in e⁻π⁺ scattering, including hadronic
corrections at the ππγ vertex.


8.4.1 e⁻ scattering from a charge distribution


It is helpful to begin the discussion by returning to e⁻ Coulomb scattering
again, but this time let us consider the case in which the potential $A^0(\mathbf{x})$
corresponds, not to a point charge, but to a spread-out charge density ρ(**x**).
Then $A^0(\mathbf{x})$ satisfies Poisson’s equation

$$ \nabla^2 A^0(\mathbf{x}) = -Ze\rho(\mathbf{x}). \qquad (8.129) $$
Note that if $A^0(\mathbf{x}) = Ze/4\pi|\mathbf{x}|$ as in (8.13) then ρ(**x**) = δ(**x**) (see appendix G)
and we recover the point-like source. The calculation of the Coulomb matrix
element will proceed as before, except that now we require, at equation (8.43),
the Fourier transform

$$ \tilde{A}^0(\mathbf{q}) = \int e^{-i\mathbf{q}\cdot\mathbf{x}}A^0(\mathbf{x})\,d^3x \qquad (8.130) $$

where **q** = **k** − **k**′. To evaluate (8.130), note first that from the definition of
$A^0(\mathbf{x})$, we can write

$$ \int e^{-i\mathbf{q}\cdot\mathbf{x}}\nabla^2A^0(\mathbf{x})\,d^3x = -Ze\int e^{-i\mathbf{q}\cdot\mathbf{x}}\rho(\mathbf{x})\,d^3x \equiv -ZeF(\mathbf{q}) \qquad (8.131) $$

where the (static) form factor F(**q**) has been introduced, the Fourier transform
of ρ(**x**), satisfying

$$ F(0) = \int\rho(\mathbf{x})\,d^3x = 1. \qquad (8.132) $$
Condition (8.132) simply means that the total charge is Ze. The left-hand side
of (8.131) can be transformed by two (three-dimensional) partial integrations
to give

$$ \int e^{-i\mathbf{q}\cdot\mathbf{x}}\nabla^2A^0(\mathbf{x})\,d^3x = -\mathbf{q}^2\int e^{-i\mathbf{q}\cdot\mathbf{x}}A^0(\mathbf{x})\,d^3x. \qquad (8.133) $$
Using this result in (8.131), we find
$$ \tilde{A}^0(\mathbf{q}) = \frac{F(\mathbf{q})}{\mathbf{q}^2}\,Ze. \qquad (8.134) $$

Thus referring to equation (8.44) for example, the net result of the
non-point-like charge distribution is to multiply the ‘point-like’ amplitude Ze²/**q**² by
the form factor F(**q**) which in this simple static case has the interpretation of
the Fourier transform of the charge distribution. So, for this (infinitely heavy
π⁺ case), the ‘blob’ in figure 8.8 would be represented by F(**q**).

To gain some idea of what F(**q**²) might look like, consider a simple
exponential shape for ρ(**x**):

$$ \rho(\mathbf{x}) = \frac{1}{8\pi a^3}\,e^{-|\mathbf{x}|/a} \qquad (8.135) $$
which has been normalized according to (8.132). Then F(**q**²) is (problem 8.13)
$$ F(\mathbf{q}^2) = \frac{1}{(\mathbf{q}^2a^2 + 1)^2}. \qquad (8.136) $$

We see that F(**q**²) decreases smoothly away from unity at **q**² = 0. The
characteristic scale of the fall-off in |**q**| is ~ a⁻¹ from (8.136), which, as expected
from Fourier transform theory, is the reciprocal of the spatial fall-off, which is
approximately a from (8.135); the root mean square radius of the distribution
(8.135) is actually √12 a (problem 8.13). Since **q**² = 4**k**² sin² θ/2, a larger **q**²
means a larger θ: hence, in scattering from an extended charge distribution,
the cross section at larger angles will drop below the point-like value. This is, 
of course, how Rutherford deduced that the nucleus had a spatial extension. 
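The static form factor of the exponential distribution (8.135) can be checked by direct numerical integration. The sketch below assumes the spherically symmetric reduction F(q) = (4π/q)∫ r sin(qr) ρ(r) dr and uses a simple composite Simpson rule; the cutoff, step count and function names are illustrative choices rather than anything prescribed by the text.

```python
import math


def charge_density(r, a):
    """Exponential density (8.135), normalized to unit total charge."""
    return math.exp(-r / a) / (8.0 * math.pi * a ** 3)


def simpson(f, lower, upper, steps):
    """Composite Simpson rule; `steps` is rounded up to an even number."""
    if steps % 2:
        steps += 1
    h = (upper - lower) / steps
    total = f(lower) + f(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(lower + i * h)
    return total * h / 3.0


def form_factor_numeric(q, a, cutoff=60.0, steps=20000):
    """Fourier transform of rho after the angular integration, truncated at r = cutoff * a."""
    def integrand(r):
        return r * math.sin(q * r) * charge_density(r, a)
    return 4.0 * math.pi / q * simpson(integrand, 0.0, cutoff * a, steps)


def form_factor_closed(q, a):
    """Closed form (8.136)."""
    return 1.0 / (1.0 + q * q * a * a) ** 2


def mean_square_radius(a, cutoff=60.0, steps=20000):
    """<r^2> = integral of r^2 rho(r) over all space; expected to be 12 a^2."""
    def integrand(r):
        return 4.0 * math.pi * r ** 4 * charge_density(r, a)
    return simpson(integrand, 0.0, cutoff * a, steps)
```

Comparing `form_factor_numeric` with `form_factor_closed` at a few values of |q| illustrates the fall-off on the scale a⁻¹, and `mean_square_radius` gives the square of the root mean square radius quoted above.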

We now seek a Lorentz-invariant generalization of this static form factor.
In the absence of a fundamental understanding of the π⁺ structure coming
from QCD, we shall rely on Lorentz invariance and electromagnetic current
conservation (one aspect of gauge invariance) to restrict the general form of
the ππγ vertex shown in figure 8.8. The use of invariance arguments to place
restrictions on the form of amplitudes is an extremely general and important
tool, in the absence of a complete theory.


8.4.2 Lorentz invariance 


First, consider Lorentz invariance. We seek to generalize the point-like ssγ
vertex (cf (8.98) and comment (1) after (8.99))

$$ j^\mu_{s^+}(p,p') = \langle s^+,p'|\hat{j}^\mu_{{\rm em},s}(0)|s^+,p\rangle = e(p+p')^\mu \qquad (8.137) $$
to $j^\mu_{\pi^+}(p,p')$, which will include strong interaction effects. Whatever these
effects are, they cannot destroy the 4-vector character of the current. To
construct the general form of $j^\mu_{\pi^+}(p,p')$ therefore, we must first enumerate the
independent momentum 4-vectors we have at our disposal to parametrize the
4-vector nature of the current. These are just

$$ p, \quad p' \quad \text{and} \quad q \qquad (8.138) $$

subject to the condition
$$ p' = p + q. \qquad (8.139) $$
There are two independent combinations; these we can choose to be the linear
combinations
$$ (p'+p)_\mu \qquad (8.140) $$
and
$$ (p'-p)_\mu = q_\mu. \qquad (8.141) $$
Both of these 4-vectors can, in general, parametrize the 4-vector nature of the
electromagnetic current of a real pion. Moreover, they can be multiplied by
an unknown scalar function of the available Lorentz scalar products for this
process. Since
$$ p^2 = p'^2 = M^2 \qquad (8.142) $$
and
$$ q^2 = 2M^2 - 2p\cdot p' \qquad (8.143) $$
there is only one independent scalar in the problem, which we may take to be
q², the 4-momentum transfer to the vertex. Thus, from Lorentz invariance,
we are led to write the electromagnetic vertex of a pion in the form

$$ j^\mu_{\pi^+}(p,p') = \langle\pi^+,p'|\hat{j}^\mu_{{\rm em},\pi}(0)|\pi^+,p\rangle = e[F(q^2)(p'+p)^\mu + G(q^2)q^\mu]. \qquad (8.144) $$

The functions F and G are called ‘form factors’.

This is as far as Lorentz invariance can take us. To identify the pion form 
factor, we must consider our second symmetry principle, gauge invariance — 
in the form of current conservation. 


8.4.3 Current conservation 


The Maxwell equations (7.65) reduce, in the Lorentz gauge
$$ \partial_\mu A^\mu = 0 \qquad (8.145) $$

to the simple form

$$ \Box A^\mu = j^\mu_{\rm em} \qquad (8.146) $$

and the gauge condition is consistent with the familiar current conservation
condition
$$ \partial_\mu j^\mu_{\rm em} = 0. \qquad (8.147) $$
As we have seen in (8.100), the current conservation condition is equivalent
to the condition
$$ q_\mu\langle\pi^+(p')|\hat{j}^\mu_{{\rm em},\pi}(0)|\pi^+(p)\rangle = 0 \qquad (8.148) $$

on the pion electromagnetic vertex.
In the case of the point-like s⁺ this is clearly satisfied since

$$ q\cdot(p'+p) = 0 \qquad (8.149) $$
with the aid of (8.142). In the general case we obtain the condition
$$ q_\mu[F(q^2)(p'+p)^\mu + G(q^2)q^\mu] = 0. \qquad (8.150) $$

The first term vanishes as before, but q² ≠ 0 in general, and we therefore
conclude that current conservation implies that

$$ G(q^2) = 0. \qquad (8.151) $$
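The step from (8.150) to (8.151) can be spelled out in one line each, using only q = p′ − p and the mass-shell conditions (8.142); this is a worked restatement of the argument, not an additional assumption:

```latex
\begin{align*}
q_\mu (p'+p)^\mu &= (p'-p)\cdot(p'+p) = p'^2 - p^2 = M^2 - M^2 = 0, \\
q_\mu\left[F(q^2)(p'+p)^\mu + G(q^2)\,q^\mu\right] &= q^2\,G(q^2) = 0
\quad\Longrightarrow\quad G(q^2) = 0 \ \text{ for } q^2 \neq 0 .
\end{align*}
```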


In other words, all the virtual strong interaction effects at the ππγ
vertex are described by one scalar function of the virtual photon’s squared
4-momentum:
$$ \underbrace{e(p'+p)^\mu}_{\text{point pion}} \;\to\; \underbrace{eF(q^2)(p'+p)^\mu}_{\text{real pion}} \qquad (8.152) $$

F(q²) is the electromagnetic form factor of the pion, which generalizes the
static form factor F(**q**²) of section 8.4.1. The pion electromagnetic vertex is
then

$$ j^\mu_{\pi^+}(p,p') = eF(q^2)(p'+p)^\mu. \qquad (8.153) $$

The electric charge is defined to be the coupling at zero momentum transfer,
so the form factor is normalized by the condition (cf (8.132))

$$ F(0) = 1. \qquad (8.154) $$


To lowest order in α, the invariant amplitude for e⁻π⁺ → e⁻π⁺ is therefore
given by replacing $j^\mu_{s^+}(p,p')$ in (8.97) or (8.109) by $j^\mu_{\pi^+}(p,p')$:

$$ i\mathcal{M}_{e^-\pi^+} = -ie(p+p')^\mu F((p'-p)^2)\left(\frac{-ig_{\mu\nu}}{(p'-p)^2}\right)[+ie\,\bar{u}(k',s')\gamma^\nu u(k,s)]. \qquad (8.155) $$
It is clear that the effect of the pion structure is simply to multiply the
‘no-structure’ cross section (8.122) by the square of the form factor,
F(q² = (p′ − p)²).
For e⁻π⁺ → e⁻π⁺ in the CM frame we may take p = (E, **p**) and p′ =
(E, **p**′) with |**p**| = |**p**′| and E = (M² + **p**²)^{1/2}. Then
$$ q^2 = (p'-p)^2 = -4\mathbf{p}^2\sin^2\theta/2 \qquad (8.156) $$

as in section 8.1, where θ is now the CM scattering angle between **p** and **p**′.

FIGURE 8.9
e⁺e⁻ → π⁺π⁻ scattering amplitude.


Hence F(q²) can be probed for negative (space-like) values of q², in the process
e⁻π⁺ → e⁻π⁺. As in the static case, we expect the form factor to fall off
as −q² increases since, roughly speaking, it represents the amplitude for the
target to remain intact when probed by the electromagnetic current. As −q²
increases, the amplitudes of inelastic processes which involve the creation of
extra particles become greater, and the elastic amplitude is correspondingly
reduced. We shall consider inelastic scattering in the following chapter.

Interestingly, F(q²) may also be measured at positive (time-like) q², in the
related reaction e⁺e⁻ → π⁺π⁻ as we now discuss.


8.5 The form factor in the time-like region: e⁺e⁻ → π⁺π⁻
and crossing symmetry


The physical process is
$$ e^+(k_1,s_1) + e^-(k,s) \to \pi^+(p') + \pi^-(p_1) \qquad (8.157) $$


as shown in figure 8.9. We can use this as an instructive exercise in the
Feynman interpretation of section 3.4.4. From that section, we know that the
invariant amplitude for (8.157) is equal to minus the amplitude for a process
in which the ingoing antiparticle e⁺ with (k₁, s₁) becomes an outgoing particle
e⁻ with (−k₁, −s₁), and the outgoing antiparticle π⁻ with p₁ becomes an
ingoing particle π⁺ with −p₁. In this way the ‘physical’ (positive 4-momentum)
antiparticle states (e⁺ and π⁻) are replaced by appropriate ‘unphysical’
(negative 4-momentum) particle states (e⁻ and π⁺). These changes transform
figure 8.9 to figure 8.10.

If we now look at figure 8.10 ‘from the top downwards’ (instead of from left
to right; remember that Feynman diagrams are not in coordinate space!), we
see a process of e⁻π⁺ scattering, namely

$$ e^-(k,s) + \pi^+(-p_1) \to e^-(-k_1,-s_1) + \pi^+(p'). \qquad (8.158) $$
FIGURE 8.10
The amplitude of figure 8.9, with positive 4-momentum antiparticles replaced 
by negative 4-momentum particles. 


FIGURE 8.11 

The amplitude of figure 8.10 redrawn so as to obtain a reaction in which the 
initial state has only ‘ingoing’ lines and the final state has only ‘outgoing’ 
lines. 


FIGURE 8.12 
One-photon exchange amplitude for the process of figure 8.11. 



FIGURE 8.13 
One-photon exchange amplitude for the process of figure 8.9. 


But (8.158) is something we have already calculated! (Though we shall have
to substitute a negative-energy spinor v for a positive energy one u.) In fact,
let us redraw figure 8.10 as figure 8.11 to make it look more like figure 8.7.
Then, to lowest order in α, the amplitude for figure 8.11 is shown in figure 8.12
(compare figure 8.8). To obtain the corresponding mathematical expression
for the amplitude $i\mathcal{M}_{e^+e^-\to\pi^+\pi^-}$, we simply need to modify (8.155): (i) by
inserting a minus sign; (ii) by replacing p by −p₁ and k′ by −k₁ as in
figure 8.12; and (iii) by replacing $\bar{u}(k',s')$ by $\bar{v}(k_1,s_1)$. This yields the invariant
amplitude for figure 8.12 as

$$ i\mathcal{M}_{e^+e^-\to\pi^+\pi^-} = -ie(-p_1+p')^\mu F((p_1+p')^2)\left(\frac{-ig_{\mu\nu}}{(p_1+p')^2}\right) \times [-ie\,\bar{v}(k_1,s_1)\gamma^\nu u(k,s)] \qquad (8.159) $$


which is represented by the Feynman diagram of figure 8.13 for the original 
process of (8.157) and figure 8.9. 

In the language introduced in section 6.3.3, figure 8.13 is an ‘s-channel
process’ (s = (k + k₁)² = (p₁ + p′)²) for e⁺e⁻ → π⁺π⁻, whereas figure
8.8 is a ‘t-channel process’ (t = (k − k′)² = (p′ − p)²) for e⁻π⁺ → e⁻π⁺.
However, we have seen that the amplitude for the e⁺e⁻ → π⁺π⁻ process can
be obtained from the e⁻π⁺ → e⁻π⁺ amplitude by making the replacement
k′ → −k₁, p → −p₁ (together with the sign, and $\bar{u} \to \bar{v}$). Under these
replacements of the 4-momenta, the variable t = (k − k′)² = (p − p′)² of
figure 8.8 becomes the variable s = (k + k₁)² = (p₁ + p′)² of figure 8.13. In
particular, as is evident in the formula (8.159), the same form factor F is a
function of the invariant s = (p₁ + p′)² in process (8.157), and of t = (p − p′)²
in process (8.128). The interesting thing is that whereas (as we have seen)
‘t’ is negative in process (8.128), ‘s’ for process (8.157) is the square of the
total CM energy, which is ≥ 4M² where M is the pion mass (2M is the
threshold energy for the reaction to proceed in the CM system). Thus the
form factor can be probed at negative values of its argument in the process
e⁻π⁺ → e⁻π⁺, and at positive values ≥ 4M² in the process e⁺e⁻ → π⁺π⁻.
In the next chapter (section 9.5) we shall see how, in the latter process, meson
resonances dominate F(s). 
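The two kinematic regions can be made concrete with a few lines of 4-vector arithmetic. The sketch below assumes CM-frame configurations for both reactions and implements crossing simply as p → −p₁ in the argument (p′ − p)² of the form factor; the function names and numerical inputs are illustrative.

```python
import math


def minkowski_square(v):
    """v^2 = v0^2 - |v|^2 for a 4-vector (v0, vx, vy, vz)."""
    return v[0] ** 2 - v[1] ** 2 - v[2] ** 2 - v[3] ** 2


def on_shell(mass, px, py, pz):
    """Positive-energy 4-momentum of a particle with the given mass and 3-momentum."""
    return (math.sqrt(mass ** 2 + px ** 2 + py ** 2 + pz ** 2), px, py, pz)


def form_factor_argument(p_in, p_out):
    """(p_out - p_in)^2, the argument of F at the pion vertex."""
    return minkowski_square(tuple(b - a for a, b in zip(p_in, p_out)))


def crossed(p):
    """Crossing: an outgoing antiparticle with momentum p enters as a particle with -p."""
    return tuple(-x for x in p)


def spacelike_t(pion_mass, p_mag, theta):
    """t for elastic scattering in the CM frame; equals -4 p^2 sin^2(theta/2)."""
    p_in = on_shell(pion_mass, 0.0, 0.0, p_mag)
    p_out = on_shell(pion_mass, p_mag * math.sin(theta), 0.0, p_mag * math.cos(theta))
    return form_factor_argument(p_in, p_out)


def timelike_s(pion_mass, p_mag):
    """s for pair production in the CM frame, obtained from t by the substitution p -> -p1."""
    p_out = on_shell(pion_mass, 0.0, 0.0, p_mag)    # pi+ with momentum p'
    p_one = on_shell(pion_mass, 0.0, 0.0, -p_mag)   # pi- with momentum p1
    return form_factor_argument(crossed(p_one), p_out)
```

With these definitions `spacelike_t` is never positive, while `timelike_s` starts at 4M² when the pions are produced at rest and grows with their momentum.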

The procedure whereby an ingoing/outgoing antiparticle is switched to 
an outgoing/ingoing particle is called ‘crossing’ (the state is being ‘crossed’ 
from one side of the reaction to the other). By an extension of this language, 
e⁺e⁻ → π⁺π⁻ is called the crossed process relative to e⁻π⁺ → e⁻π⁺ (or
vice versa). The fact that the amplitude for a given process and its ‘crossed’ 
analogue are directly related via the Feynman interpretation (or by quantum 
field theory!) is called ‘crossing symmetry’. In the example studied here, what 
is an s-channel process for one reaction becomes a t-channel process for the 
crossed reaction. Essentially, little more is involved than looking in the one 
case from left to right and, in the other, from top to bottom! 


8.6 Electron Compton scattering

8.6.1 The lowest-order amplitudes


We proceed to explore some other elementary electromagnetic processes. So 
far we have not considered a reaction with external photons, so let us now 
discuss electron Compton scattering 


$$ \gamma(k,\lambda) + e^-(p,s) \to \gamma(k',\lambda') + e^-(p',s') \qquad (8.160) $$

where the λ's stand for the polarizations of the photons. Since only the γ's
and e⁻'s are involved, the interaction Hamiltonian is simply $\hat{H}'_{\rm D}$, and it is
clear that this must act at least twice in the reaction (8.160). By following
the method of section 6.3.2 one can formally derive what we are here going to
assume is by now obvious, which is that to order e² (i.e. α in the amplitude)
there are two contributing Feynman graphs, as shown in figures 8.14(a) and
(b). The first is an s-channel process, the second a u-channel process. We
already know the factors for the vertices and for the external electron lines; we
need to know the factors for the internal electron lines (propagators) and the
external photon lines. The fermion propagator was given in section 7.2 and is
$i/(\slashed{q} - m + i\epsilon)$ for a line carrying 4-momentum q. As regards the ‘external-γ’
factor, this will arise from contractions of the form (cf (6.90))

$$ \sqrt{2E_{k'}}\,\langle 0|\hat\alpha(k',\lambda')\hat{A}^\mu(x_1)|0\rangle = \epsilon^{\mu*}(k',\lambda')\,e^{ik'\cdot x_1} \qquad (8.161) $$


where the evaluation of the vev has used the mode expansion (7.104) and the
commutation relations (7.108), as usual; note, however, that only transverse
polarization states (λ, λ′ = 1 and 2) enter in the external (physical) photon
lines in figures 8.14(a) and (b).

FIGURE 8.14
O(e²) contributions to electron Compton scattering.


Thus we add two more rules to the (i)-(v) of section 8.3.1:
(vi) For an incoming photon of 4-momentum k and polarization λ, there
is a factor $\epsilon^\mu(k,\lambda)$; for an outgoing one, $\epsilon^{\mu*}(k',\lambda')$.
(vii) For an internal spin-½ particle carrying 4-momentum q, there is a
factor $i/(\slashed{q} - m + i\epsilon) = i(\slashed{q} + m)/(q^2 - m^2 + i\epsilon)$.
The invariant amplitude $\mathcal{M}_{\gamma e^-}$ corresponding to figures 8.14(a) and (b) is
therefore

$$ \mathcal{M}_{\gamma e^-} = -e^2\,\epsilon^*_\mu(k',\lambda')\epsilon_\nu(k,\lambda)\,\bar{u}(p',s')\gamma^\mu\frac{(\slashed{p}+\slashed{k}+m)}{(p+k)^2 - m^2}\gamma^\nu u(p,s) - e^2\,\epsilon^*_\mu(k',\lambda')\epsilon_\nu(k,\lambda)\,\bar{u}(p',s')\gamma^\nu\frac{(\slashed{p}-\slashed{k}'+m)}{(p-k')^2 - m^2}\gamma^\mu u(p,s). \qquad (8.162) $$


To get the spinor factors in expressions such as these, the rule is to start 
at the ingoing fermion line (‘u(p,s)’) and follow the line through until the 
end, inserting vertices and propagators in the right order, until you reach the 
outgoing state (‘$\bar{u}$’). Note that here s = (p + k)² and u = (p − k′)².


8.6.2 Gauge invariance 


We learned in section 7.3.1 that the gauge symmetry ($A^\mu \to A^\mu - \partial^\mu\chi$) of
electromagnetism, as applied to real free photons, implied that any photon
polarization vector $\epsilon^\mu(k,\lambda)$ could be replaced by

$$ \epsilon'^\mu(k,\lambda) = \epsilon^\mu(k,\lambda) + \beta k^\mu \qquad (8.163) $$

where β is an arbitrary constant. Such a transformation amounted to a change
of gauge, always remaining within the Lorentz gauge for which ε·k = ε′·k = 0.
Thus our amplitude (8.162) must be unchanged if we make either or both the
replacements ε → ε + βk and ε* → ε* + βk′ indicated in (8.163). This means
that if in (8.162) we replace either or both of $\epsilon_\nu(k,\lambda)$ and $\epsilon^*_\mu(k',\lambda')$ by $k_\nu$
and $k'_\mu$ respectively, the result has to be zero. This can indeed be verified
(problem 8.14).

FIGURE 8.15
General one-photon process.
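The vanishing of (8.162) under ε → k or ε* → k′ can also be seen numerically. The sketch below is one possible check, assuming the Dirac representation of the γ-matrices, spinors normalized as u = (√(E+m) χ, **σ**·**p** χ/√(E+m)), an electron initially at rest, and e = 1; all function names and sample values are illustrative.

```python
import math

METRIC = (1.0, -1.0, -1.0, -1.0)  # diagonal metric, signature (+,-,-,-)


def gamma_matrices():
    """Dirac-representation gamma^0..gamma^3 as nested lists (an assumed choice)."""
    sigma = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
    gammas = [[[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]]
    for s in sigma:
        g = [[0] * 4 for _ in range(4)]
        for a in range(2):
            for b in range(2):
                g[a][b + 2] = s[a][b]
                g[a + 2][b] = -s[a][b]
        gammas.append(g)
    return gammas


G = gamma_matrices()


def dot(a, b):
    return sum(METRIC[mu] * a[mu] * b[mu] for mu in range(4))


def slash(p, mass=0.0):
    """gamma^mu p_mu + mass * identity for a contravariant 4-vector p."""
    return [[sum(METRIC[mu] * p[mu] * G[mu][i][j] for mu in range(4)) + (mass if i == j else 0.0)
             for j in range(4)] for i in range(4)]


def matmul(a, b):
    return [[sum(a[i][n] * b[n][j] for n in range(4)) for j in range(4)] for i in range(4)]


def spinor_u(p, m, spin):
    """Positive-energy spinor u(p, s); spin = 0 or 1 labels the rest-frame z-projection."""
    energy, px, py, pz = p
    chi = (1.0, 0.0) if spin == 0 else (0.0, 1.0)
    norm = math.sqrt(energy + m)
    sigma_p = [[pz, px - 1j * py], [px + 1j * py, -pz]]
    lower = [sum(sigma_p[a][b] * chi[b] for b in range(2)) / norm for a in range(2)]
    return [norm * chi[0], norm * chi[1], lower[0], lower[1]]


def bar(u):
    """Dirac adjoint u^dagger gamma^0 (gamma^0 is diagonal in this representation)."""
    c = [complex(x).conjugate() for x in u]
    return [c[0], c[1], -c[2], -c[3]]


def compton_amplitude(eps_out_conj, eps_in, k, k_out, p, p_out, m, spin_in, spin_out, e=1.0):
    """The two terms of (8.162); eps_out_conj is the already-conjugated outgoing polarization."""
    s_minus_m2 = 2 * dot(p, k)        # (p + k)^2 - m^2 for k^2 = 0
    u_minus_m2 = -2 * dot(p, k_out)   # (p - k')^2 - m^2 for k'^2 = 0
    p_plus_k = tuple(a + b for a, b in zip(p, k))
    p_minus_k_out = tuple(a - b for a, b in zip(p, k_out))
    s_term = matmul(matmul(slash(eps_out_conj), slash(p_plus_k, m)), slash(eps_in))
    u_term = matmul(matmul(slash(eps_in), slash(p_minus_k_out, m)), slash(eps_out_conj))
    ubar, u = bar(spinor_u(p_out, m, spin_out)), spinor_u(p, m, spin_in)
    total = sum(ubar[i] * (s_term[i][j] / s_minus_m2 + u_term[i][j] / u_minus_m2) * u[j]
                for i in range(4) for j in range(4))
    return -e * e * total


def lab_kinematics(m, omega, theta):
    """Photon along +z hitting an electron at rest; k' from the Compton formula."""
    omega_out = omega / (1 + (omega / m) * (1 - math.cos(theta)))
    k = (omega, 0.0, 0.0, omega)
    k_out = (omega_out, omega_out * math.sin(theta), 0.0, omega_out * math.cos(theta))
    p = (m, 0.0, 0.0, 0.0)
    p_out = tuple(a + b - c for a, b, c in zip(p, k, k_out))
    return k, k_out, p, p_out
```

Calling `compton_amplitude` with `eps_in` replaced by `k`, or `eps_out_conj` replaced by `k_out`, should give zero to rounding error for every spin combination, whereas physical transverse polarizations give a non-zero result. The cancellation only occurs between the two graphs; either term alone is not gauge invariant.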

A similar result is generally true and very important. Consider a process,
shown in figure 8.15, involving a photon of momentum $k^\mu$, whose polarization
state is described by the vector $\epsilon^\mu$. The amplitude $A_\lambda$ for this process must
be linear in the photon polarization vector and thus we may write
$$ A_\lambda = \epsilon^\mu(k,\lambda)T_\mu \qquad (8.164) $$

where $T_\mu$ depends on the particular process under consideration. With the
Lorentz choice for $\epsilon^\mu$ we have
$$ k\cdot\epsilon = 0. \qquad (8.165) $$

But gauge invariance implies that if we replace $\epsilon^\mu$ in (8.164) by $k^\mu$ we must
get zero:
$$ k^\mu T_\mu = 0. \qquad (8.166) $$

This important condition on $T_\mu$ is known as a Ward identity (Ward 1950).


8.6.3 The Compton cross section 


The calculation of the cross section is of considerable interest, since it is
required when considering lowest-order QCD corrections to the parton model
for deep inelastic scattering of leptons from nucleons (see the following
chapter and volume 2). We must average $|\mathcal{M}_{\gamma e^-}|^2$ over initial electron spins and
photon polarizations and sum over final ones. Consider first the s-channel
process of figure 8.14(a), with amplitude $\mathcal{M}^{(s)}_{\gamma e^-}$. For this contribution we
must evaluate

    $\frac{e^4}{4(s - m^2)^2} \sum_{\lambda, \lambda', s, s'} \epsilon'^*_\mu \epsilon_\nu \epsilon'_\rho \epsilon^*_\sigma \, \bar{u}' \gamma^\mu (\not{p} + \not{k} + m) \gamma^\nu u \, \bar{u} \gamma^\sigma (\not{p} + \not{k} + m) \gamma^\rho u'$    (8.167)


where we have shortened the notation in an obvious way and introduced the
invariant Mandelstam variable (section 6.3.3) $s = (p + k)^2$. We know how to
write the spin sums in a convenient form, as a trace. We need to find a similar
trick for the polarization sum.

Consider the general ‘one-photon’ process shown in figure 8.15, with amplitude
$\mathcal{A}_\lambda = \epsilon^\mu(k, \lambda) T_\mu$, where $\epsilon^\mu(k, 1) = (0, 1, 0, 0)$ and $\epsilon^\mu(k, 2) = (0, 0, 1, 0)$,
and $k^\mu = (k, 0, 0, k)$. Then the required polarization sum would be

    $\sum_{\lambda=1,2} \epsilon^\mu(k, \lambda) T_\mu \, \epsilon^{\nu *}(k, \lambda) T^*_\nu = |T_1|^2 + |T_2|^2.$    (8.168)

However, we also know that $k^\mu T_\mu = 0$ from the Ward identity (8.166). This
tells us that

    $k T_0 - k T_3 = 0$    (8.169)

and hence $T_0 = T_3$. It follows that we may write (8.168) as

    $\sum_{\lambda=1,2} \epsilon^\mu(k, \lambda) \epsilon^{\nu *}(k, \lambda) T_\mu T^*_\nu = |T_1|^2 + |T_2|^2 + |T_3|^2 - |T_0|^2$    (8.170)
    $\qquad = -g^{\mu\nu} T_\mu T^*_\nu.$    (8.171)

Thus we may replace the non-covariant expression ‘$\sum_{\lambda=1,2} \epsilon^\mu(k, \lambda) \epsilon^{\nu *}(k, \lambda)$’
by the covariant one ‘$-g^{\mu\nu}$’. The reader may here recall equation (7.118),
where the ‘pseudo-completeness’ relation involving all four $\epsilon$'s was given, a
similarly covariant expression. This relation corresponds exactly to the
right-hand side of (8.170), which (in these terms) shows that the $\lambda = 0$ state enters
with negative norm.
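The replacement of the transverse polarization sum by $-g^{\mu\nu}$ is easy to check numerically for a single example. The following is only a minimal sketch: it assumes the metric $(+,-,-,-)$, a photon moving along the $z$-axis, and a made-up vector $T$ (treated as carrying upper indices) chosen by hand to satisfy the Ward identity. The names `transverse_sum` and `covariant_sum` are ours, introduced for illustration.

```python
# Minimal numerical sketch of the polarization-sum replacement (8.168)-(8.171).
# Assumptions: metric (+,-,-,-), photon along z, and T carries upper indices.

METRIC = (1.0, -1.0, -1.0, -1.0)  # diagonal of g_{mu nu}


def minkowski_dot(a, b):
    """Return a.b = a^0 b^0 - a^1 b^1 - a^2 b^2 - a^3 b^3 (no conjugation)."""
    return sum(g * x * y for g, x, y in zip(METRIC, a, b))


def transverse_sum(T):
    """Sum of |eps(lambda).T|^2 over the two physical polarizations."""
    eps = [(0.0, 1.0, 0.0, 0.0), (0.0, 0.0, 1.0, 0.0)]
    return sum(abs(minkowski_dot(e, T)) ** 2 for e in eps)


def covariant_sum(T):
    """Return -g_{mu nu} T^mu (T^nu)^*."""
    return -sum(g * abs(t) ** 2 for g, t in zip(METRIC, T))


k = (2.0, 0.0, 0.0, 2.0)                                # k^mu = (k, 0, 0, k)
T_ward = (0.7 - 0.2j, 1.1 + 0.3j, -0.4j, 0.7 - 0.2j)    # T^0 = T^3, so k.T = 0
T_other = (0.1 + 0.0j, 1.1 + 0.3j, -0.4j, 0.7 - 0.2j)   # violates k.T = 0

print(abs(minkowski_dot(k, T_ward)))                    # Ward identity holds
print(transverse_sum(T_ward), covariant_sum(T_ward))    # the two sums agree
print(transverse_sum(T_other), covariant_sum(T_other))  # here they differ
```

The third line of output illustrates that the replacement is legitimate only because of the Ward identity: for a vector that does not satisfy $k^\mu T_\mu = 0$ the two sums no longer coincide.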

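Using this result, the term (8.167) becomes

    $\frac{e^4}{4(s - m^2)^2} \sum_{s, s'} \bar{u}' \gamma^\mu (\not{p} + \not{k} + m) \gamma^\nu u \, \bar{u} \gamma_\nu (\not{p} + \not{k} + m) \gamma_\mu u'$
    $\qquad = \frac{e^4}{4(s - m^2)^2} \mathrm{Tr}[\gamma_\mu (\not{p}' + m) \gamma^\mu (\not{p} + \not{k} + m) \gamma^\nu (\not{p} + m) \gamma_\nu (\not{p} + \not{k} + m)]$    (8.172)

where, in the second step, we have moved the $\gamma_\mu$ to the front of the trace,
using (8.71). Expression (8.172) involves the trace of eight $\gamma$ matrices, which
is beyond the power of the machinery given so far. However, it simplifies
greatly if we neglect the electron mass, that is, if we are interested in the
high-energy limit, as we shall be in parton model applications. In that case,
(8.172) becomes

    $\frac{e^4}{4s^2} \mathrm{Tr}[\gamma_\mu \not{p}' \gamma^\mu (\not{p} + \not{k}) \gamma^\nu \not{p} \gamma_\nu (\not{p} + \not{k})]$    (8.173)

which we can simplify using the result (J.3) to

    $\frac{e^4}{s^2} \mathrm{Tr}[\not{p}' (\not{p} + \not{k}) \not{p} (\not{p} + \not{k})]$    (8.174)
    $\qquad = \frac{e^4}{s^2} \mathrm{Tr}[\not{p}' \not{k} \not{p} \not{k}]$  using $\not{p}\not{p} = p^2 = 0$    (8.175)
    $\qquad = \frac{4e^4}{s^2} \, 2 (p' \cdot k)(p \cdot k)$  using (8.76) and $k^2 = 0$    (8.176)
    $\qquad = -2e^4 u/s$    (8.177)

where $u = (p - k')^2$. Problem 8.15 finishes the calculation, with the result
that the spin-averaged squared amplitude is

    $\frac{1}{4} \sum_{s, s', \lambda, \lambda'} |\mathcal{M}_{\gamma e^-}|^2 = -2e^4 \left( \frac{u}{s} + \frac{s}{u} \right).$    (8.178)

FIGURE 8.16
$e^-\mu^-$ scattering amplitude.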


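The cross section in the CMS is then (cf (6.129))

    $\frac{d\sigma}{d(\cos\theta)} = \frac{2\pi}{64\pi^2 s} \, 2e^4 \left( -\frac{u}{s} - \frac{s}{u} \right) = \frac{\pi\alpha^2}{s} \left( -\frac{u}{s} - \frac{s}{u} \right).$    (8.179)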
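As a quick cross-check of the step from (8.176) to (8.177), one can build explicit massless momenta in the CM frame and compare the two expressions. This is a sketch under stated assumptions: all masses are neglected, $e^4$ is set to 1, the scattering angle is taken between the incoming and outgoing photon, and the function names are ours.

```python
import math


def dot(a, b):
    """Minkowski product with metric (+,-,-,-)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]


def cm_momenta(energy, cos_theta):
    """Massless CM momenta (p, k, p_out, k_out); theta is the angle between
    the incoming and outgoing photon directions."""
    sin_theta = math.sqrt(1.0 - cos_theta ** 2)
    p = (energy, 0.0, 0.0, -energy)          # incoming electron
    k = (energy, 0.0, 0.0, energy)           # incoming photon
    k_out = (energy, energy * sin_theta, 0.0, energy * cos_theta)
    p_out = (energy, -energy * sin_theta, 0.0, -energy * cos_theta)
    return p, k, p_out, k_out


def mandelstam_s_u(p, k, k_out):
    """s = (p + k)^2 and u = (p - k')^2."""
    total = tuple(a + b for a, b in zip(p, k))
    diff = tuple(a - b for a, b in zip(p, k_out))
    return dot(total, total), dot(diff, diff)


def s_channel_term(p, k, p_out, e4=1.0):
    """Right-hand side of (8.176): (4 e^4 / s^2) * 2 (p'.k)(p.k)."""
    s = 2.0 * dot(p, k)
    return 4.0 * e4 / s ** 2 * 2.0 * dot(p_out, k) * dot(p, k)


def averaged_amplitude_squared(s, u, e4=1.0):
    """Equation (8.178): -2 e^4 (u/s + s/u)."""
    return -2.0 * e4 * (u / s + s / u)


p, k, p_out, k_out = cm_momenta(energy=5.0, cos_theta=0.3)
s, u = mandelstam_s_u(p, k, k_out)
print(s_channel_term(p, k, p_out), -2.0 * u / s)  # (8.176) versus (8.177)
print(averaged_amplitude_squared(s, u))            # positive, since u is negative
```

Because $s$ is positive and $u$ negative in the physical region, the combination $-2e^4(u/s + s/u)$ is positive, as a squared amplitude must be.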
For parton model calculations, what is actually required is the analogous
quantity calculated for the case in which the initial photon is virtual (see
section 9.2). However, the discussion of section 7.3.2 shows that we may
still use the polarization sum (8.170). A difference will arise in passing from
(8.175) to (8.176) where we must remember that $k^2 \neq 0$. Since $k^2$ will be
space-like, we put $k^2 = -Q^2$ and find (problem 8.16) that the spin-averaged
squared amplitude for the virtual Compton process

    $\gamma^*(k^2 = -Q^2) + e^- \to \gamma + e^-$    (8.180)

is given by

    $-2e^4 \left( \frac{u}{s} + \frac{s}{u} - \frac{2Q^2 t}{su} \right).$    (8.181)


8.7 Electron muon elastic scattering 


Our final examples of electrodynamic processes are ones in which two fermions
interact electromagnetically. In this section we discuss the scattering of two
point-like fermions (i.e. leptons); in the following one we look at the changes
(analogous to those for the $\pi^+$ as compared to the $s^+$) necessitated when one
fermion is a hadron, for example the proton.

FIGURE 8.17
One-photon exchange amplitude in $e^-\mu^-$ scattering.

We shall consider $e^-\mu^-$ elastic scattering: our notation is indicated in
figure 8.16. In the lowest order of perturbation theory (the one-photon exchange
approximation) we can draw the relevant Feynman graph for this process.
This is shown in figure 8.17. All the elements for the graph have been met
before and so we can immediately write down the invariant amplitude which
now depends on four spin labels:


    $\mathcal{M}_{e^-\mu^-}(r, s; r', s') = e \bar{u}(k', s') \gamma_\mu u(k, s) \, (g^{\mu\nu}/q^2) \, e \bar{u}(p', r') \gamma_\nu u(p, r).$    (8.182)

Although experiments with polarized leptons are not uncommon, we shall
only be concerned with the unpolarized cross section

    $d\bar{\sigma} \sim \frac{1}{4} \sum_{r, r'; s, s'} |\mathcal{M}_{e^-\mu^-}(r, s; r', s')|^2.$    (8.183)

We perform the same manipulations as in our $e^-s^+$ example and the cross
section reduces to a factorized form involving two traces:

    $\frac{1}{4} \sum_{r, r'; s, s'} |\mathcal{M}_{e^-\mu^-}|^2 = \left( \frac{e^2}{q^2} \right)^2 \left\{ \frac{1}{2} \mathrm{Tr}[(\not{k}' + m) \gamma_\mu (\not{k} + m) \gamma_\nu] \right\}$
    $\qquad \times \left\{ \frac{1}{2} \mathrm{Tr}[(\not{p}' + M) \gamma^\mu (\not{p} + M) \gamma^\nu] \right\}$    (8.184)
    $\qquad = (e^2/q^2)^2 L_{\mu\nu} M^{\mu\nu}$    (8.185)

where $L_{\mu\nu}$ is the ‘electron tensor’ calculated before (see (8.119)):

    $L_{\mu\nu} = 2[k'_\mu k_\nu + k'_\nu k_\mu + (q^2/2) g_{\mu\nu}]$    (8.186)

but now $M^{\mu\nu}$ is the appropriate tensor for the muon coupling, with the same
structure as $L_{\mu\nu}$:

    $M^{\mu\nu} = 2[p'^\mu p^\nu + p'^\nu p^\mu + (q^2/2) g^{\mu\nu}].$    (8.187)

To evaluate the cross section we must perform the ‘contraction’ $L_{\mu\nu} M^{\mu\nu}$.
A useful trick to simplify this calculation is to use current conservation for the
electron tensor $L_{\mu\nu}$. For the electron transition current, the electromagnetic
current conservation condition is (cf equation (8.100))

    $q^\mu [\bar{u}(k', s') \gamma_\mu u(k, s)] = 0$    (8.188)

i.e. it holds independently of the particular spin projections $s$ and $s'$. Since $L_{\mu\nu}$ is
the product of two such currents, summed and averaged over polarizations,
current conservation implies the conditions

    $q^\mu L_{\mu\nu} = q^\nu L_{\mu\nu} = 0$    (8.189)

which can be explicitly checked using our result for $L_{\mu\nu}$. The usefulness of
this result is that in the contraction $L_{\mu\nu} M^{\mu\nu}$ we can replace $p'$ in $M^{\mu\nu}$ by
$(p + q)$ and then drop all the terms involving $q$'s, i.e.

    $L_{\mu\nu} M^{\mu\nu} = L_{\mu\nu} M^{\mu\nu}_{\rm eff}$    (8.190)

where

    $M^{\mu\nu}_{\rm eff} = 2[2 p^\mu p^\nu + (q^2/2) g^{\mu\nu}].$    (8.191)
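The current-conservation conditions (8.189) and the resulting simplification (8.190) can be checked numerically with explicit on-shell momenta. The sketch below is illustrative only: it works in the CM frame with assumed mass values in GeV, the metric $(+,-,-,-)$, and helper names of our own choosing.

```python
import math

G = (1.0, -1.0, -1.0, -1.0)  # diagonal metric, signature (+,-,-,-)


def dot(a, b):
    """Minkowski product a.b."""
    return sum(g * x * y for g, x, y in zip(G, a, b))


def current_tensor(a, b, q2):
    """2[a^mu b^nu + a^nu b^mu + (q^2/2) g^{mu nu}], with upper indices."""
    return [[2.0 * (a[m] * b[n] + a[n] * b[m]
                    + (0.5 * q2 * G[m] if m == n else 0.0))
             for n in range(4)] for m in range(4)]


def contract(A, B):
    """A_{mu nu} B^{mu nu}; both indices of A are lowered with the metric."""
    return sum(G[m] * G[n] * A[m][n] * B[m][n]
               for m in range(4) for n in range(4))


def cm_kinematics(P, theta, m, M):
    """On-shell CM momenta k, k' (electron) and p, p' (muon), plus q = k - k'."""
    Ee, Em = math.hypot(P, m), math.hypot(P, M)
    k = (Ee, 0.0, 0.0, P)
    p = (Em, 0.0, 0.0, -P)
    k_out = (Ee, P * math.sin(theta), 0.0, P * math.cos(theta))
    q = tuple(a - b for a, b in zip(k, k_out))
    p_out = tuple(a + b for a, b in zip(p, q))
    return k, k_out, p, p_out, q


k, k_out, p, p_out, q = cm_kinematics(P=1.0, theta=0.7, m=0.000511, M=0.10566)
q2 = dot(q, q)
L = current_tensor(k_out, k, q2)        # (8.186)
M_full = current_tensor(p_out, p, q2)   # (8.187)
M_eff = current_tensor(p, p, q2)        # (8.191)

q_dot_L = [sum(G[m] * q[m] * L[m][n] for m in range(4)) for n in range(4)]
print(max(abs(x) for x in q_dot_L))              # (8.189): zero up to rounding
print(contract(L, M_full), contract(L, M_eff))   # (8.190): the two agree
```

Note that the electron mass is kept here, so the check confirms that (8.189) does not rely on the massless approximation.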


The calculation of the cross section is now straightforward. In the ‘laboratory’
system, defined (unrealistically) by the target muon at rest

    $p^\mu = (M, 0, 0, 0)$    (8.192)

with $M$ now the muon mass, the result is (problem 8.17(a))

    $\frac{d\sigma}{d\Omega} = \left( \frac{d\sigma}{d\Omega} \right)_{\rm ns} \left[ 1 - \frac{q^2}{2M^2} \tan^2(\theta/2) \right].$    (8.193)


Note the following points: 


Comment (a) 


The ‘no-structure’ cross section (8.122) for $e^-s^+$ scattering now appears
modified by an additional term proportional to $\tan^2(\theta/2)$. This is due to the spin-$\frac{1}{2}$
nature of the muon which gives rise to scattering from both the charge and
the magnetic moment of the muon.


Comment (b) 


In the kinematics the electron mass has been neglected, which is usually a
good approximation at high energies. We should add a word of explanation
for the ‘laboratory’ cross sections we have calculated, with the target muon
unrealistically at rest. The form of the cross section, $(d\sigma/d\Omega)_{\rm ns}$, and of the
cross section for the scattering of two Dirac point particles, will be of great
value in our discussion of the quark parton model in the next chapter.

Comment (c)

The crossed version of this process, namely $e^+e^- \to \mu^+\mu^-$, is a very important
monitoring reaction for electron-positron colliding beam machines. It is also
basic to a discussion of the predictions of the quark parton model for $e^+e^- \to$
hadrons, which will be discussed in section 9.5. An instructive calculation
similar to this one leads to the result (see problem 8.18)

    $\frac{d\sigma}{d\Omega} = \frac{\alpha^2}{4q^2} (1 + \cos^2\theta)$    (8.194)

where all variables are defined in the $e^+e^-$ CM frame, $q^2$ is now the square of
the CM energy, and the electron and muon masses have been neglected. The
total cross section, in the one-photon exchange approximation, is then

    $\sigma = 4\pi\alpha^2/3q^2 = 86.8\ \mathrm{nb}/q^2(\mathrm{GeV}^2),$    (8.195)

where we have made use of equation (B.18) of appendix B.

The energy dependence of this cross section ($\propto 1/q^2$) is important, and
can be understood by a simple dimensional argument. A cross section has di- 
mensions of a squared length, or in natural units (appendix B) inverse squared 
mass or energy. Here both colliding particles are taken to be pointlike, with 
no form factors involving a length parameter, and the mediating quantum is 
massless. At energies much larger than the lepton masses, the only available 
dimensional quantity is the CM energy. It follows that the cross section must 
be inversely proportional to the square of the CM energy, in this ‘pointlike, 
high energy’ limit. By the same token, deviations from this behaviour would 
be evidence for non-pointlike leptonic structure. 
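The number in (8.195) and its $1/q^2$ scaling are simple to reproduce. The following sketch integrates the angular distribution (8.194) numerically and converts from natural units to nanobarns; the values used for $\alpha$ and for the conversion constant $(\hbar c)^2$ are assumed inputs, quoted to a few significant figures only.

```python
import math

ALPHA = 1.0 / 137.036          # fine-structure constant (assumed value)
HBARC2_NB_GEV2 = 0.389379e6    # (hbar c)^2 in nb GeV^2 (assumed conversion)


def dsigma_domega(q2_gev2, cos_theta):
    """Equation (8.194) in natural units (GeV^-2)."""
    return ALPHA ** 2 / (4.0 * q2_gev2) * (1.0 + cos_theta ** 2)


def sigma_total_nb(q2_gev2):
    """Equation (8.195): 4 pi alpha^2 / (3 q^2), converted to nanobarns."""
    return 4.0 * math.pi * ALPHA ** 2 / (3.0 * q2_gev2) * HBARC2_NB_GEV2


def sigma_from_angular_integral_nb(q2_gev2, steps=2000):
    """Midpoint-rule integral of (8.194) over the full solid angle."""
    h = 2.0 / steps
    integral = h * sum(dsigma_domega(q2_gev2, -1.0 + (i + 0.5) * h)
                       for i in range(steps))
    return 2.0 * math.pi * integral * HBARC2_NB_GEV2


print(sigma_total_nb(1.0))                  # close to 86.8 nb at q^2 = 1 GeV^2
print(sigma_from_angular_integral_nb(1.0))  # the same number from (8.194)
print(sigma_total_nb(100.0))                # 100 times smaller at q^2 = 100 GeV^2
```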




8.8 Electron—proton elastic scattering and nucleon form 
factors 


FIGURE 8.18
One-photon exchange amplitude in $e^-p$ scattering, including hadronic
corrections at the $pp\gamma$ vertex.

In the one-photon exchange approximation, the Feynman diagram for elastic
electron-proton scattering may be drawn as in figure 8.18, where the ‘blob’ at
the $pp\gamma$ vertex signifies the expected modification of the point coupling due to
strong interactions. The structure of the proton vertex can be analysed using
symmetry principles in the same way as for the pion vertex. The presence
of Dirac spinors and $\gamma$-matrices makes this a somewhat involved procedure:
problem 8.20 is an example of the type of complication that arises. Full
details of such an analysis can be found in Bernstein (1968), for example. Here,
however, we shall proceed in a different way, in order to generalize more easily
to inelastic scattering in the following chapter. We focus directly on the
‘proton tensor’ $B^{\mu\nu}$, which is the product of two proton current matrix elements,
summed and averaged over polarizations, as is required in the calculation of
the unpolarized cross section (cf (8.57)):


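    $e^2 B^{\mu\nu} = \frac{1}{2} \sum_{s, s'} \langle p; p', s' | \hat{j}^\mu_{{\rm em}, p}(0) | p; p, s \rangle \left( \langle p; p', s' | \hat{j}^\nu_{{\rm em}, p}(0) | p; p, s \rangle \right)^*.$    (8.196)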


We remarked in comment (a) after equation (8.193) that for $e^-$ scattering
from a point-like charged fermion an additional term in the cross section
was present, corresponding to scattering from the target’s magnetic moment.
Since a real proton is not a point particle, the virtual strong interaction effects
will modify both the charge and the magnetic moment distribution. Hence
we may expect that two form factors will be needed to describe the deviation
from point-like behaviour. This is in fact the case, as we now show using
symmetry arguments similar to those of section 8.4.


8.8.1 Lorentz invariance 


$B^{\mu\nu}$ must retain its tensor character: this must be made up using the available
4-vectors and tensors at our disposal. For the spin-averaged case we have only

    $p$, $q$ and $g_{\mu\nu}$    (8.197)


since $p' = p + q$. The antisymmetric tensor $\epsilon_{\mu\nu\alpha\beta}$ (see appendix J) must
actually be ruled out using parity invariance: the tensor $B^{\mu\nu}$ is not a pseudo
tensor since $\hat{j}^\mu_{{\rm em}, p}$ is a vector. It is helpful to remember that $\epsilon_{\mu\nu\alpha\beta}$ is the
generalization of $\epsilon_{ijk}$ in three dimensions, and that the vector product of two
3-vectors (a pseudo vector) may be written

    $(\boldsymbol{a} \times \boldsymbol{b})_i = \epsilon_{ijk} a_j b_k.$    (8.198)

8.8.2 Current conservation

For a real proton, current conservation gives the condition (cf (8.148))

    $q_\mu \langle p; p', s' | \hat{j}^\mu_{{\rm em}, p}(0) | p; p, s \rangle = 0$    (8.199)

which translates to the conditions (cf (8.189))

    $q_\mu B^{\mu\nu} = q_\nu B^{\mu\nu} = 0$    (8.200)

on the tensor $B^{\mu\nu}$.

There are only two possible tensors we can make that satisfy both these
requirements. One involves $p$ and is constructed to be orthogonal to $q$. We
introduce a vector

    $\tilde{p}_\mu = p_\mu + \alpha q_\mu$    (8.201)

and require

    $q \cdot \tilde{p} = 0.$    (8.202)

Hence we find

    $\tilde{p}_\mu = p_\mu - (p \cdot q/q^2) q_\mu$    (8.203)

and thus the tensor

    $\tilde{p}^\mu \tilde{p}^\nu = [p^\mu - (p \cdot q/q^2) q^\mu][p^\nu - (p \cdot q/q^2) q^\nu]$    (8.204)

satisfies all our requirements. The second tensor must involve $g^{\mu\nu}$ and may
be chosen to be

    $-g^{\mu\nu} + q^\mu q^\nu/q^2$    (8.205)

which again satisfies our conditions. Thus from invariance arguments alone,
the tensor $B^{\mu\nu}$ for the proton vertex may be parametrized by these two
tensors, each multiplied by an unknown function of $q^2$. If we define

    $B^{\mu\nu} = 4A(q^2)[p^\mu - (p \cdot q/q^2) q^\mu][p^\nu - (p \cdot q/q^2) q^\nu]$
    $\qquad + 2M^2 B(q^2)(-g^{\mu\nu} + q^\mu q^\nu/q^2)$    (8.206)

the cross section in the laboratory frame is (problem 8.19)

    $\frac{d\sigma}{d\Omega} = \left( \frac{d\sigma}{d\Omega} \right)_{\rm ns} [A + B \tan^2(\theta/2)].$    (8.207)

Formula (8.207) implies that a plot of $(d\sigma/d\Omega)/(d\sigma/d\Omega)_{\rm ns}$ versus $\tan^2(\theta/2)$, at
fixed $q^2$, will be a straight line with slope $B$ and intercept $A$.
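That the two tensor structures in (8.206) satisfy the current-conservation conditions (8.200) can be confirmed with a few lines of arithmetic. The sketch below is illustrative: the values of $A$, $B$, the mass and the momentum transfer are arbitrary made-up numbers, and the identity being tested does not depend on them.

```python
G = (1.0, -1.0, -1.0, -1.0)  # diagonal metric, signature (+,-,-,-)


def dot(a, b):
    """Minkowski product a.b."""
    return sum(g * x * y for g, x, y in zip(G, a, b))


def proton_tensor(p, q, A, B, M):
    """B^{mu nu} of (8.206), upper indices, for given numbers A and B."""
    q2 = dot(q, q)
    p_tilde = [p[m] - dot(p, q) / q2 * q[m] for m in range(4)]   # (8.203)
    return [[4.0 * A * p_tilde[m] * p_tilde[n]
             + 2.0 * M ** 2 * B * ((-G[m] if m == n else 0.0) + q[m] * q[n] / q2)
             for n in range(4)] for m in range(4)]


def q_contracted(q, tensor):
    """q_mu B^{mu nu}, returned as a list over nu."""
    return [sum(G[m] * q[m] * tensor[m][n] for m in range(4)) for n in range(4)]


M = 0.938                      # proton mass in GeV (illustrative)
p = (M, 0.0, 0.0, 0.0)         # target at rest
q = (0.3, 0.5, 0.0, 0.8)       # an arbitrary space-like q, with q^2 = -0.8
tensor = proton_tensor(p, q, A=0.6, B=0.25, M=M)
print(q_contracted(q, tensor))  # all four components vanish up to rounding
```

Since the tensor is symmetric in $\mu$ and $\nu$, the contraction on the second index vanishes as well.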

The functions $A$ and $B$ may be related to the ‘charge’ and ‘magnetic’ form
factors of the proton. The Dirac ‘charge’ and Pauli ‘anomalous magnetic
moment’ form factors, $F_1$ and $F_2$ respectively, are defined by

    $\langle p; p', s' | \hat{j}^\mu_{{\rm em}, p}(0) | p; p, s \rangle = (+e) \bar{u}(p', s') \left[ \gamma^\mu F_1(q^2) + \frac{i \kappa F_2(q^2)}{2M} \sigma^{\mu\nu} q_\nu \right] u(p, s)$    (8.208)

with the normalization

    $F_1(0) = 1$    (8.209)
    $F_2(0) = 1$    (8.210)

and the magnetic moment of the proton is not one (nuclear) magneton, as for
an electron or muon (neglecting higher-order corrections), but rather $\mu_p =
1 + \kappa$ with $\kappa = 1.79$. Problem 8.20 shows that the $\bar{u} \gamma^\mu u$ piece in (8.208) can
be rewritten in terms of $\bar{u} (p + p')^\mu u/2M$ and $\bar{u} i \sigma^{\mu\nu} q_\nu u/2M$. The first of these
is analogous to the interaction of a charged spin-0 particle. As regards the
second, we note that $\sigma^{\mu\nu}$ is just

    $\sigma^{\mu\nu} = \frac{i}{2} [\gamma^\mu, \gamma^\nu]$    (8.211)

which reduces to the Pauli spin matrices for the space-like components

    $\sigma^{ij} = \begin{pmatrix} \sigma^k & 0 \\ 0 & \sigma^k \end{pmatrix}$    (8.212)

with our representation of $\gamma$-matrices ($\sigma^{ij}$ is a $4 \times 4$ matrix, $\sigma^k$ is $2 \times 2$, and $i$,
$j$ and $k$ are in cyclic order). The second term in this ‘Gordon decomposition’
of $\bar{u} \gamma^\mu u$ thus corresponds to an interaction via the spin magnetic moment
(with, in fact, $g = 2$). Thus the addition of the $\kappa$ term in (8.208) corresponds
to an ‘anomalous’ magnetic moment piece. In terms of $F_1$ and $F_2$ one can
show that

    $A = F_1^2 + \tau \kappa^2 F_2^2$    (8.213)
    $B = 2\tau (F_1 + \kappa F_2)^2$    (8.214)

where

    $\tau = -q^2/4M^2.$    (8.215)

The point-like cross section (8.193) is recovered from (8.207) by setting $F_1 = 1$
and $\kappa = 0$ in (8.213) and (8.214).

The functions $F_1$ and $F_2$ are, in turn, usually expressed in terms of the
electric and magnetic form factors $G_E$ and $G_M$, defined by $G_E = F_1 - \tau \kappa F_2$, $G_M =
F_1 + \kappa F_2$. We then find $A = (G_E^2 + \tau G_M^2)/(1 + \tau)$ and $B = 2\tau G_M^2$. The cross
section formula (8.207), written in terms of $G_E$ and $G_M$, is known as the
‘Rosenbluth’ cross section.
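The equivalence of the two ways of writing $A$ and $B$ is a short algebraic exercise, and it can also be confirmed numerically. In the sketch below the form-factor values are made-up numbers chosen only for illustration, and the function names are ours; the value of $\kappa$ is the one quoted in the text.

```python
KAPPA_PROTON = 1.79  # anomalous magnetic moment quoted in the text


def tau_from_q2(q2, M):
    """tau = -q^2 / 4M^2, equation (8.215); q2 is negative (space-like)."""
    return -q2 / (4.0 * M ** 2)


def ab_from_dirac_pauli(F1, F2, tau, kappa=KAPPA_PROTON):
    """A and B of (8.213) and (8.214)."""
    A = F1 ** 2 + tau * kappa ** 2 * F2 ** 2
    B = 2.0 * tau * (F1 + kappa * F2) ** 2
    return A, B


def sachs_from_dirac_pauli(F1, F2, tau, kappa=KAPPA_PROTON):
    """G_E = F1 - tau*kappa*F2 and G_M = F1 + kappa*F2."""
    return F1 - tau * kappa * F2, F1 + kappa * F2


def ab_from_sachs(GE, GM, tau):
    """A = (G_E^2 + tau G_M^2)/(1 + tau) and B = 2 tau G_M^2."""
    return (GE ** 2 + tau * GM ** 2) / (1.0 + tau), 2.0 * tau * GM ** 2


tau = tau_from_q2(q2=-1.0, M=0.938)   # Q^2 = 1 GeV^2, illustrative
F1, F2 = 0.35, 0.20                   # made-up form-factor values
GE, GM = sachs_from_dirac_pauli(F1, F2, tau)
print(ab_from_dirac_pauli(F1, F2, tau))
print(ab_from_sachs(GE, GM, tau))     # same pair of numbers

# Point-like limit: F1 = 1, kappa = 0 gives A = 1 and B = 2 tau, as in (8.193).
print(ab_from_dirac_pauli(1.0, 1.0, tau, kappa=0.0), 2.0 * tau)
```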

Experimental data indicate that the $q^2$-dependences of $G_E$ and $G_M$ for
the proton, and of $G_M$ for the neutron, are all quite well represented by the
function F(q?) of (8.136) with q? replaced by —q? and with a ~ 0.84 GeV’, 
at least for values of —q? up to a few GeV? (see, for example, Perkins 1987, 
section 6.5). 

Before we leave elastic scattering it is helpful to look in some more detail
at the kinematics. It will be sufficient to consider the ‘point-like’ case, which
we shall call $e^-\mu^+$, for definiteness. Energy and momentum conservation at
the $\mu^+$ vertex gives the condition

    $p + q = p'$    (8.216)

with the mass-shell conditions ($M$ is the $\mu^+$ mass)

    $p^2 = p'^2 = M^2.$    (8.217)

Hence for elastic scattering we have the relation

    $2 p \cdot q = -q^2.$    (8.218)

It is conventional to relate these invariants to the corresponding laboratory
frame ($p^\mu = (M, \boldsymbol{0})$) expressions. Neglecting the electron mass so that²

    $k \equiv |\boldsymbol{k}| = \omega$    (8.219)
    $k' \equiv |\boldsymbol{k}'| = \omega'$    (8.220)

we have

    $q^2 = -2kk'(1 - \cos\theta) = -4kk' \sin^2(\theta/2)$    (8.221)

and

    $p \cdot q = M(k - k') = M\nu$    (8.222)

where $\nu$ is the energy transfer $q^0$ in this frame. To avoid unnecessary minus
signs, it is convenient to define

    $Q^2 = -q^2 = 4kk' \sin^2(\theta/2)$    (8.223)

and the elastic scattering relation between $p \cdot q$ and $q^2$ reads

    $\nu = Q^2/2M$    (8.224)

or, equivalently,

    $\frac{k'}{k} = \frac{1}{1 + (2k/M) \sin^2(\theta/2)}.$    (8.225)

Remembering, therefore, that for elastic scattering $k'$ and $\theta$ are not
independent variables, we can perform a change of variables (see appendix K) in the
laboratory frame

    $d\Omega = 2\pi \, d(\cos\theta) = (\pi/k'^2) \, dQ^2$    (8.226)

and write the differential cross section for $e^-\mu^+$ scattering as

    $\frac{d\sigma}{dQ^2} = \frac{\pi\alpha^2}{4k^2 \sin^4(\theta/2)} \frac{1}{kk'} [\cos^2(\theta/2) + 2\tau \sin^2(\theta/2)].$    (8.227)


² As after equation (8.126), note again that in the present context ‘$k$’ and ‘$k'$’ are not
4-vectors but the moduli of 3-vectors.

FIGURE 8.19
Physical regions for $e^-p$ scattering in the $Q^2$, $\nu$ variables: A, kinematically
forbidden region; B, line of elastic scattering ($Q^2 = 2M\nu$); C, lines of
resonance electroproduction; D, photoproduction; E, deep inelastic region ($Q^2$
and $\nu$ large).


For elastic scattering $\nu$ is not independent of $Q^2$ but we may formally write
this as a double-differential cross section by inserting the $\delta$-function to ensure
this condition is satisfied:

    $\frac{d^2\sigma}{dQ^2 \, d\nu} = \frac{\pi\alpha^2}{4k^2 \sin^4(\theta/2)} \frac{1}{kk'} \left[ \cos^2(\theta/2) + \frac{Q^2}{2M^2} \sin^2(\theta/2) \right] \delta\left( \nu - \frac{Q^2}{2M} \right).$    (8.228)

This is the cross section for the scattering of an electron from a point-like
fermion target of charge $e$ and mass $M$.
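The elastic kinematics of (8.221) to (8.227) are easily turned into a small calculator. The sketch below assumes the laboratory frame with the electron mass neglected, energies in GeV, and an assumed value for $\alpha$; the function names are ours.

```python
import math

ALPHA = 1.0 / 137.036  # fine-structure constant (assumed value)


def elastic_kinematics(k, theta, M):
    """Return (k', Q^2, nu) for elastic scattering with the electron mass
    neglected and the target of mass M at rest."""
    s2 = math.sin(theta / 2.0) ** 2
    k_out = k / (1.0 + (2.0 * k / M) * s2)   # (8.225)
    Q2 = 4.0 * k * k_out * s2                # (8.223)
    nu = k - k_out                           # (8.222)
    return k_out, Q2, nu


def dsigma_dQ2(k, theta, M):
    """Point-like cross section (8.227) in natural units (GeV^-4)."""
    k_out, Q2, _ = elastic_kinematics(k, theta, M)
    s2 = math.sin(theta / 2.0) ** 2
    tau = Q2 / (4.0 * M ** 2)                # (8.215) with q^2 = -Q^2
    return (math.pi * ALPHA ** 2 / (4.0 * k ** 2 * s2 ** 2)
            / (k * k_out) * ((1.0 - s2) + 2.0 * tau * s2))


k_out, Q2, nu = elastic_kinematics(k=10.0, theta=math.radians(20.0), M=0.938)
print(k_out, Q2, nu, Q2 / (2.0 * 0.938))  # the last two numbers coincide
print(dsigma_dQ2(10.0, math.radians(20.0), 0.938))
```

The coincidence of $\nu$ and $Q^2/2M$ in the output is just the elastic condition (8.224), which is what the $\delta$-function in (8.228) enforces.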

It is illuminating to plot out the physically allowed regions of $Q^2$ and
$\nu$ (figure 8.19). Elastic $e^-p$ scattering corresponds to the line $Q^2 = 2M\nu$.
Resonance production $e^-p \to e^-N^*$ with $p'^2 = M'^2$ corresponds to lines
parallel to the elastic line, shifted to the right by $M'^2 - M^2$ since

    $2M\nu = Q^2 + M'^2 - M^2.$    (8.229)

Experiments with real photons, $Q^2 = 0$, correspond to exploring along the
$\nu$-axis. In the next chapter we switch our attention to so-called deep inelastic
electron scattering, the region of large $Q^2$ and large $\nu$.

Problems


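8.1 Consider a matrix element of the form

    $\mathcal{M} = \int d^3x \int dt \, e^{+i p_f \cdot x} \, \partial_\mu A^\mu \, e^{-i p_i \cdot x}.$

Assuming the integration is over all space-time and that

    $A^0 \to 0$ as $t \to \pm\infty$

and

    $|\boldsymbol{A}| \to 0$ as $|\boldsymbol{x}| \to \infty$

use integration by parts to show

(a) $\int dt \, e^{+i p_f \cdot x} \, \partial_0 A^0 \, e^{-i p_i \cdot x} = (-i p_{f0}) \int dt \, e^{+i p_f \cdot x} A^0 e^{-i p_i \cdot x}$

(b) $\int d^3x \, e^{+i p_f \cdot x} \, \boldsymbol{\nabla} \cdot \boldsymbol{A} \, e^{-i p_i \cdot x} = +i \boldsymbol{p}_f \cdot \int d^3x \, e^{+i p_f \cdot x} \boldsymbol{A} \, e^{-i p_i \cdot x}.$

Hence show that

    $\int d^3x \int dt \, e^{+i p_f \cdot x} (\partial_\mu A^\mu + A^\mu \partial_\mu) e^{-i p_i \cdot x}$
    $\qquad = -i (p_f + p_i)_\mu \int d^3x \int dt \, e^{+i p_f \cdot x} A^\mu e^{-i p_i \cdot x}.$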

8.2 Verify equation (8.27). 


8.3 Evaluate (8.31) and interpret the result physically (i.e. compare it with 
(8.27)). 


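8.4

(a) Using the $u$-spinors normalized as in (3.73), the $\phi^{1,2}$ of (8.47), and
the result for $\boldsymbol{\sigma} \cdot \boldsymbol{A} \, \boldsymbol{\sigma} \cdot \boldsymbol{B}$ from problem 3.4(b), show that

    $u^\dagger(k', s' = 1) u(k, s = 1) = (E + m) \left\{ 1 + \frac{\boldsymbol{k}' \cdot \boldsymbol{k}}{(E + m)^2} + \frac{i \phi^{1\dagger} \boldsymbol{\sigma} \cdot \boldsymbol{k}' \times \boldsymbol{k} \, \phi^1}{(E + m)^2} \right\}.$

(b) For any vector $\boldsymbol{A} = (A^1, A^2, A^3)$, show that $\phi^{1\dagger} \boldsymbol{\sigma} \cdot \boldsymbol{A} \phi^1 = A^3$. Find
similar expressions for $\phi^{1\dagger} \boldsymbol{\sigma} \cdot \boldsymbol{A} \phi^2$, $\phi^{2\dagger} \boldsymbol{\sigma} \cdot \boldsymbol{A} \phi^1$, $\phi^{2\dagger} \boldsymbol{\sigma} \cdot \boldsymbol{A} \phi^2$.

(c) Show that the $S$ of (8.46) is equal to

    $(E + m)^2 \left\{ \left[ 1 + \frac{\boldsymbol{k} \cdot \boldsymbol{k}'}{(E + m)^2} \right]^2 + \frac{(\boldsymbol{k} \times \boldsymbol{k}')^2}{(E + m)^4} \right\}.$

(d) Using $\cos\theta = \boldsymbol{k} \cdot \boldsymbol{k}'/(|\boldsymbol{k}||\boldsymbol{k}'|)$, $|\boldsymbol{k}| = |\boldsymbol{k}'|$ and $v = |\boldsymbol{k}|/E$, show that

    $S = (2E)^2 (1 - v^2 \sin^2 \theta/2).$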


8.5 Verify equation (8.55). 

8.6 Check that $\gamma^0 \gamma^{\mu\dagger} \gamma^0 = \gamma^\mu$.

8.7 Verify equation (8.79) for the lepton tensor $L^{\mu\nu}$.

8.8 Evaluate L% as in equation (8.80). 

8.9 Verify equation (8.87). 

8.10 Verify equation (8.96) for the $e^-s^+ \to e^-s^+$ amplitude to $O(e^2)$.


8.11 Check that both the scalar and the spinor current matrix elements, (8.27)
and (8.55), satisfy $\partial_\mu j^\mu(x) = 0$.


8.12 Verify equation (8.120).


8.13 Verify equation (8.136) for the Fourier transform of $\rho(\boldsymbol{x})$ given by (8.135).
Show that the mean square radius of the distribution (8.135) is $12a^2$.


8.14 Check the gauge invariance of $\mathcal{M}_{\gamma e^-}$ given by (8.162), by showing that
if $\epsilon_\nu$ is replaced by $k_\nu$, or $\epsilon^*_\mu$ by $k'_\mu$, the result is zero.


8.15 


(a) The spin-averaged squared amplitude for lowest-order electron
Compton scattering contains the interference term

    $\sum_{\lambda, \lambda', s, s'} \mathcal{M}^{(s)}_{\gamma e^-} \mathcal{M}^{(u)*}_{\gamma e^-}$

where (s) and (u) refer to the s- and u-channel processes of
figure 8.14(a) and (b) respectively. Obtain an expression analogous
to (8.172) for this term, and prove that it is, in fact, zero. [Hint:
work in the massless limit, and use relations (J.4) and (J.5).]


(b) Explain why the term

    $\sum_{\lambda, \lambda', s, s'} \mathcal{M}^{(u)}_{\gamma e^-} \mathcal{M}^{(u)*}_{\gamma e^-}$

is given by (8.177) with $s$ and $u$ interchanged.


8.16 Recalculate the interference term of problem 8.15(a) for the case $k^2 =
-Q^2$ (but with $k'^2 = p^2 = p'^2 = 0$), and hence verify (8.181).

8.17

(a) Derive an expression for the spin-averaged differential cross section
for lowest-order $e^-\mu^-$ scattering in the laboratory frame, defined
by $p^\mu = (M, \boldsymbol{0})$ where $M$ is now the muon mass, and show that it
may be written in the form

    $\frac{d\sigma}{d\Omega} = \left( \frac{d\sigma}{d\Omega} \right)_{\rm ns} [1 - (q^2/2M^2) \tan^2(\theta/2)]$

where the ‘no-structure’ cross section is that of $e^-s^+$ scattering
(appendix K) and the electron mass has been neglected.

(b) Neglecting all masses, evaluate the spin-averaged expression (8.184)
in terms of $s$, $t$ and $u$ and use the result

    $\frac{d\sigma}{dt} = \frac{1}{16\pi s^2} \frac{1}{4} \sum_{r, r'; s, s'} |\mathcal{M}_{e^-\mu^-}(r, s; r', s')|^2$

to show that the $e^-\mu^-$ cross section may be written in the form

    $\frac{d\sigma}{dt} = \frac{4\pi\alpha^2}{t^2} \frac{1}{2} \left( 1 + \frac{u^2}{s^2} \right).$

Show also that by introducing the variable $y$, defined in terms of
laboratory variables by $y = (k - k')/k$, this reduces to the result

    $\frac{d\sigma}{dy} = \frac{4\pi\alpha^2}{t^2} \, s \, \frac{1}{2} [1 + (1 - y)^2].$


8.18 Consider the process $e^+e^- \to \mu^+\mu^-$ in the CM frame.

(a) Draw the lowest-order Feynman diagram and write down the
corresponding amplitude.

(b) Show that the spin-averaged squared matrix element has the form

    $\frac{(4\pi\alpha)^2}{q^4} L(e)_{\mu\nu} L(\mu)^{\mu\nu}$

where $q^2$ is the square of the total CM energy, and $L(e)$ depends on
the $e^-$ and $e^+$ momenta and $L(\mu)$ on those of the $\mu^+$, $\mu^-$.

(c) Evaluate the traces and the tensor contraction (neglecting lepton
masses): (i) directly, using the trace theorems; and (ii) by using
crossing symmetry and the results of section 8.7 for $e^-\mu^-$ scattering.
Hence show that


    $|\mathcal{M}|^2 = (4\pi\alpha)^2 (1 + \cos^2\theta)$

where $\theta$ is the CM scattering angle, and that the CM differential
cross section is

    $\frac{d\sigma}{d\Omega} = \frac{\alpha^2}{4q^2} (1 + \cos^2\theta).$

Hence show that the total cross section is (see equation (B.18) of
appendix B)

    $\sigma = 4\pi\alpha^2/3q^2 = 86.8\ \mathrm{nb}/q^2(\mathrm{GeV}^2).$

Figure 8.20 shows data (a) for $\sigma$ in $e^+e^- \to \mu^+\mu^-$ and $e^+e^- \to
\tau^+\tau^-$ and (b) for the angular distribution in $e^+e^- \to \mu^+\mu^-$. Note
that $s = q^2$. The data in figure 8.20(a) agree well with the
prediction above for $\sigma$. The broken curve in figure 8.20(b) shows the pure
QED prediction of part (c).
It is clear that, while the distribution has the general $1 + \cos^2\theta$ form
as predicted, there is a small but definite forward-backward
asymmetry. This arises because, in addition to the $\gamma$-exchange amplitude
there is also a $Z^0$-exchange amplitude (see section 22.3 of volume 2)
which we have neglected. Such asymmetries are an important test
of the electroweak theory. They are too small to be visible in the
total cross sections in figure 8.20(a).

FIGURE 8.20
(a) Total cross sections for $e^+e^- \to \mu^+\mu^-$ and $e^+e^- \to \tau^+\tau^-$; (b) differential
cross section for $e^+e^- \to \mu^+\mu^-$. (From D H Perkins 2000 Introduction to
High Energy Physics 4th edn, courtesy Cambridge University Press.)

8.19 Verify equation (8.207). [Hint: as in equation (8.191) the terms in $q^\mu$
and $q^\nu$ in $B^{\mu\nu}$ may be neglected because of the conditions (8.189).]

8.20 Starting from the expression

    $\bar{u}(p') \frac{i \sigma^{\mu\nu} q_\nu}{2M} u(p)$

where $q = p' - p$ and $\sigma^{\mu\nu} = \frac{1}{2} i [\gamma^\mu, \gamma^\nu]$, use the Dirac equation and properties
of $\gamma$-matrices to prove the ‘Gordon decomposition’ of the current

    $\bar{u}(p') \gamma^\mu u(p) = \bar{u}(p') \left[ \frac{(p + p')^\mu}{2M} + \frac{i \sigma^{\mu\nu} q_\nu}{2M} \right] u(p).$


Deep Inelastic Electron-Nucleon Scattering
and the Parton Model


We have obtained the rules for doing calculations of simple processes in
quantum electrodynamics for particles of spin-0 and spin-$\frac{1}{2}$, and many explicit
examples have been considered. In this chapter we build on these results to 
give an (admittedly brief) introduction to a topic of central importance in par- 
ticle physics, the structure of hadrons as revealed by deep inelastic scattering 
experiments (the equally important neutrino scattering experiments will be 
discussed in volume 2). We do this partly because the necessary calculations 
involve straightforward, illustrative and eminently practical applications of 
the rules already obtained, but, more particularly, because it is from a com- 
parison of these calculations with experiment that compelling evidence was 
obtained for the existence of the point-like constituents of hadrons — quarks 
and gluons — the interactions of which are described by QCD. 




9.1 Inelastic electron—proton scattering: kinematics and 
structure functions 


At large momentum transfers there is very little elastic scattering: inelastic 
scattering, in which there is more than just the electron and proton in the final 
state, is much more probable. The simplest inelastic cross section to measure 
is the so-called ‘inclusive’ cross section, for which only the final electron is 
observed. This is therefore a sum over the cross sections for all the possible 
hadronic final states: no attempt is made to select any particular state from 
the hadronic debris created at the proton vertex. This process may be repre- 
sented by the diagram of figure 9.1, assuming that the one-photon exchange 
amplitude dominates. The ‘blob’ at the proton vertex indicates our ignorance 
of the detailed structure: X indicates a sum over all possible hadronic final 
states. However, the assumption of one-photon exchange, which is known 
experimentally to be a very good approximation, means that, as in our pre- 
vious examples (cf (8.118) and (8.185)), the cross section must factorize into 
a leptonic tensor contracted with a tensor describing the hadron vertex: 


    $d\sigma \sim L_{\mu\nu} W^{\mu\nu}(q, p).$    (9.1)

FIGURE 9.1
Inelastic electron-proton scattering, in one-photon exchange approximation.

The lepton vertex is well described by QED and takes the same form as
before:

    $L_{\mu\nu} = 2[k'_\mu k_\nu + k'_\nu k_\mu + (q^2/2) g_{\mu\nu}].$    (9.2)


For the hadron tensor, however, we expect strong interactions to play an im- 
portant role and we must deduce its general structure by our powerful invari- 
ance arguments. We will only consider unpolarized scattering and therefore 
perform an average over the initial proton spins. The sum over final states, X, 
includes all possible quantum numbers for each hadronic state with total mo- 
mentum p’. For an inclusive cross section, the final phase space involves only 
the scattered electron. Moreover, since we are not restricting the scattering 
process by picking out any specific state of X, the energy $k'$ and the scattering
angle $\theta$ of the final electron are now independent variables. In $W^{\mu\nu}(q, p)$ the
sum over X includes the phase space for each hadronic state restricted by the
usual 4-momentum-conserving $\delta$-function to ensure that each state in X has
momentum $p'$. Including some conventional factors, we define $W^{\mu\nu}(q, p)$ by
(see problem 9.1)

    $e^2 W^{\mu\nu}(q, p) = \frac{1}{4\pi M} \frac{1}{2} \sum_s \sum_X \langle p; p, s | \hat{j}^\mu_{{\rm em}, p}(0) | X; p' \rangle \langle X; p' | \hat{j}^\nu_{{\rm em}, p}(0) | p; p, s \rangle$
    $\qquad \times (2\pi)^4 \delta^4(p + q - p').$    (9.3)

How do we parametrize the tensor structure of $W^{\mu\nu}$? As usual, Lorentz
invariance and current conservation come to our aid. There is one important
difference compared with the elastic form factor case of section 8.8. For
inclusive inelastic scattering there are now two independent scalar variables. The
relation

    $p' = p + q$    (9.4)

leads to

    $p'^2 = M^2 + 2 p \cdot q + q^2$    (9.5)

where $M$ is the proton mass. In this case, the invariant mass of the hadronic
final state is a variable

    $p'^2 = W^2$    (9.6)

and is related to the other two scalar variables

    $p \cdot q = M\nu$    (9.7)

and (cf (8.223))

    $q^2 = -Q^2$    (9.8)

by the condition (cf (8.229))

    $2M\nu = Q^2 + W^2 - M^2.$    (9.9)

Our invariance arguments lead us to the same tensor structure as for elastic
electron-proton scattering, but now the functions $A(q^2)$, $B(q^2)$ are replaced
by ‘structure functions’ which are functions of two variables, usually taken to
be $\nu$ and $Q^2$. The conventional definition of the proton structure functions
$W_1$ and $W_2$ is

    $W^{\mu\nu}(q, p) = (-g^{\mu\nu} + q^\mu q^\nu/q^2) W_1(Q^2, \nu)$
    $\qquad + [p^\mu - (p \cdot q/q^2) q^\mu][p^\nu - (p \cdot q/q^2) q^\nu] M^{-2} W_2(Q^2, \nu).$    (9.10)
Inserting the usual flux factor together with the final electron phase space
leads to the following expression for the inclusive differential cross section for
inelastic electron-proton scattering (see problem 9.1):

    $d\sigma = \left( \frac{4\pi\alpha}{q^2} \right)^2 \frac{1}{4[(k \cdot p)^2 - m^2 M^2]^{1/2}} \, 4\pi M L_{\mu\nu} W^{\mu\nu} \frac{d^3k'}{2\omega' (2\pi)^3}.$    (9.11)

In terms of ‘laboratory’ variables, neglecting electron mass effects, this yields
(problem 9.2(a))

    $\frac{d^2\sigma}{d\Omega \, dk'} = \frac{\alpha^2}{4k^2 \sin^4(\theta/2)} [W_2 \cos^2(\theta/2) + 2W_1 \sin^2(\theta/2)].$    (9.12)

Remembering now that $\cos\theta$ and $k'$ are independent variables for inelastic
scattering, we can change variables from $\cos\theta$ and $k'$ to $Q^2$ and $\nu$, assuming
azimuthal symmetry for the unpolarized cross section. We have

    $Q^2 = 2kk'(1 - \cos\theta)$    (9.13)
    $\nu = k - k'$    (9.14)

so that (problem 9.2(b))

    $d(\cos\theta) \, dk' = \frac{1}{2kk'} \, dQ^2 \, d\nu$    (9.15)

and

    $\frac{d^2\sigma}{dQ^2 \, d\nu} = \frac{\pi\alpha^2}{4k^2 \sin^4(\theta/2)} \frac{1}{kk'} [W_2 \cos^2(\theta/2) + 2W_1 \sin^2(\theta/2)].$    (9.16)

Yet another choice of variables is sometimes used instead of these, namely the
dimensionless variables

    $x = Q^2/2M\nu$    (9.17)

whose significance we shall see in the next section, and

    $y = \nu/k$    (9.18)

which is the fractional energy transfer in the ‘laboratory’ frame. Note that
relation (8.224) shows that $x = 1$ for elastic scattering. The Jacobian for the
transformation from $Q^2$ and $\nu$ to $x$ and $y$ is (see problem 9.2(b))

    $dQ^2 \, d\nu = 2Mk^2 y \, dx \, dy.$    (9.19)

We emphasize that the foregoing, in particular (9.3), (9.12) and (9.16), is all
completely general, given the initial one-photon approximation. The physics
is all contained in the $\nu$ and $Q^2$ dependence of the two structure functions $W_1$
and $W_2$.
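The various kinematic variables introduced above are related by simple algebra, which may be collected in a short routine. This is a sketch only: it assumes the laboratory frame with the electron mass neglected, energies in GeV, and an illustrative proton mass; the function and key names are ours.

```python
import math


def dis_variables(k, k_out, theta, M=0.938):
    """Return Q^2, nu, x, y and W^2 from laboratory quantities (GeV units).
    The default M is an illustrative proton mass."""
    Q2 = 2.0 * k * k_out * (1.0 - math.cos(theta))   # (9.13)
    nu = k - k_out                                    # (9.14)
    return {
        "Q2": Q2,
        "nu": nu,
        "x": Q2 / (2.0 * M * nu),                     # (9.17)
        "y": nu / k,                                  # (9.18)
        "W2": M ** 2 + 2.0 * M * nu - Q2,             # (9.9)
    }


def jacobian_Q2_nu_wrt_x_y(k, y, M=0.938):
    """Factor in (9.19): dQ^2 dnu = 2 M k^2 y dx dy."""
    return 2.0 * M * k ** 2 * y


inelastic = dis_variables(k=20.0, k_out=8.0, theta=math.radians(10.0))
print(inelastic)   # x lies between 0 and 1, and W^2 exceeds M^2

# Elastic point: k' fixed by (8.225), so that x = 1 and W^2 = M^2.
k, theta, M = 10.0, 0.3, 0.938
k_elastic = k / (1.0 + (2.0 * k / M) * math.sin(theta / 2.0) ** 2)
print(dis_variables(k, k_elastic, theta, M))
```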

A priori, one might expect $W_1$ and $W_2$ to be complicated functions of $\nu$
and $Q^2$, reflecting the complexity of the inelastic scattering process.
However, in 1969 Bjorken predicted that in the ‘deep inelastic region’ (large $\nu$
and $Q^2$, but $Q^2/\nu$ finite) there should be a very simple behaviour. He
predicted that the structure functions should scale, i.e. become functions not of
$Q^2$ and $\nu$ independently but only of their ratio $Q^2/\nu$. It was the verification
of approximate ‘Bjorken scaling’ that led to the development of the modern
parton model. We therefore specialize our discussion of inelastic scattering to
the deep inelastic region.




9.2 Bjorken scaling and the parton model 


From considerations based on the quark model current algebra of Gell-Mann
(1962), Bjorken (1969) was led to propose the following ‘scaling hypothesis’:
in the limit

    $Q^2 \to \infty$, $\nu \to \infty$ with $x = Q^2/2M\nu$ fixed    (9.20)

the structure functions scale as

    $M W_1(Q^2, \nu) \to F_1(x)$    (9.21)
    $\nu W_2(Q^2, \nu) \to F_2(x).$    (9.22)

FIGURE 9.2
Bjorken scaling: the structure function $\nu W_2$ (a) plotted against $x$ for different
$Q^2$ values (Attwood 1980, courtesy SLAC) and (b) plotted against $Q^2$ for the
single $x$ value, $x = 0.25$ (Friedman and Kendall 1972).

We must emphasize that the physical content of Bjorken’s hypothesis is that
the functions $F_1(x)$ and $F_2(x)$ are finite¹.

Early experimental support for these predictions (figure 9.2) led initially to
an examination of the theoretical basis of Bjorken’s arguments and to the
formulation of the simple intuitive picture provided by the parton model. Closer
scrutiny of figure 9.2(a) will encourage the (correct) suspicion that, in fact,
there is a small but significant spread in the data for any given $x$ value. In
volume 2 we shall give an introduction to the way in which QCD corrections
to the parton model lead to predictions for logarithmic (in $Q^2$) violations of
simple scaling behaviour, which are in excellent agreement with experiment.
These violations are particularly large at small values of $x$; for $x$ greater than
about 0.1, the structure functions are substantially independent of $Q^2$, for
a given $x$. The scaling predicted by Bjorken is certainly the most immediate
gross feature of the data, and an understanding of it is of fundamental
importance.

How can the scaling be understood? Feynman, when asked to explain 
Bjorken’s arguments, gave an intuitive explanation in terms of elastic scatter- 
ing from free point-like constituents of the nucleon, which he dubbed ‘partons’ 
(Feynman 1969). The essence of the argument lies in the kinematics of elastic 
scattering of electrons by free point-like charged partons: we will therefore be 
able to use the results of the previous chapters to derive the parton model 
results. At high $Q^2$ and $\nu$ it is intuitively reasonable (and in fact the basis for


$^1$It is always possible to write $W(Q^2,\nu) = f(x, Q^2)$, say, where $f(x,Q^2)$ will tend to
some function $F(x)$ as $Q^2 \to \infty$ with $x$ fixed. $F(x)$ may, however, be zero, finite or infinite.
The physics lies in the hypothesis that, in this limit, a finite part remains.
FIGURE 9.3
Photon-parton interaction. 


the light-cone and short-distance operator approach (Wilson 1969) to scaling) 
that the virtual photon is probing very short distances and time scales within 
the proton. In this situation, Feynman supposed that the photon interacts 
with small (point-like) constituents within the proton, which carry only a certain
fraction $f$ of the proton’s energy and momentum (figure 9.3). Over the
short time scales involved in the transfer of a large amount of energy $\nu$, and
at the short distances probed at large $Q^2$, the struck constituents can perhaps
be treated as effectively free and independent. (This is in sharp contrast to
the case of elastic scattering, where the constituents are acting coherently.)
We then have the idealized elastic scattering process shown in figure 9.4. It
is the kinematics of the elastic scattering condition for the partons that leads
directly to a relation between $Q^2$ and $\nu$ and hence to the observed scaling
behaviour. The original discussion of the parton model took place in the 
infinite-momentum frame of the proton. While this has the merit that it 
eliminates the need for explicit statements about parton masses and so on, it 
also obscures the simple kinematic origin of the scaling. For this reason, at the 
expense of some theoretical niceties, we prefer to perform a direct calculation 
of electron-parton scattering in close analogy with our previous examples.

We first show that the fraction $f$ is none other than Bjorken’s variable $x$.
For a parton of type $i$ we write

$$p_i^\mu = f p^\mu \tag{9.23}$$

and, roughly speaking$^2$, we can imagine that the partons have mass

$$m_i \approx f M. \tag{9.24}$$

Then, exactly as in (8.216) and (8.217), energy and momentum conservation
$^2$Explicit statements about parton transverse momenta and masses, such as those made
in equations (9.23) and (9.24), are unnecessary in a rigorous treatment, where such quantities
can be shown to give rise to non-leading scaling behaviour (Sachrajda 1983).
FIGURE 9.4
Elastic electron-parton scattering.


at the parton vertex, together with the assumption that the struck parton 
remains on-shell (as indicated by the fact that in figure 9.4 the partons are 
free), imply that 

$$(q + f p)^2 = m_i^2 \tag{9.25}$$

which, using (9.8), (8.222) and (9.24), gives

$$f = Q^2/2M\nu \equiv x. \tag{9.26}$$
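
The step from (9.25) to (9.26) can be checked numerically. The following is a minimal sketch, assuming the laboratory frame with the proton at rest, $p = (M,0,0,0)$, and the photon along the $z$-axis, $q = (\nu,0,0,|\mathbf{q}|)$ with $q^2 = -Q^2$. The function names, the approximate proton mass and the sample kinematics are illustrative choices, not taken from the text.

```python
M_PROTON = 0.938  # proton mass in GeV (approximate value, assumed for illustration)

def bjorken_x(Q2, nu, M=M_PROTON):
    """Bjorken variable x = Q^2 / (2 M nu)."""
    return Q2 / (2.0 * M * nu)

def struck_parton_mass_sq(f, Q2, nu, M=M_PROTON):
    """Invariant (q + f p)^2 for p = (M,0,0,0) and q = (nu,0,0,|q|)."""
    q3_sq = nu**2 + Q2            # from q^2 = nu^2 - |q|^2 = -Q^2
    energy = nu + f * M           # time component of q + f p
    return energy**2 - q3_sq

Q2, nu = 10.0, 20.0               # sample values: Q^2 in GeV^2, nu in GeV
x = bjorken_x(Q2, nu)
lhs = struck_parton_mass_sq(x, Q2, nu)   # (q + f p)^2 evaluated at f = x
rhs = (x * M_PROTON)**2                  # m_i^2 with m_i = f M, as in (9.24)
print(x, lhs, rhs)
```

Choosing any other fraction, for example `f = x / 2`, leaves the two sides unequal by an amount of order $Q^2$, which is the sense in which the on-shell condition singles out $f = x$.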


Thus the fact that the nucleon structure functions do seem to depend 
(to a good approximation) only on the variable x is interpreted physically as 
showing that the scattering is dominated by the ‘quasi-free’ electron—parton 
process shown in figure 9.4. In section 11.5.3 we shall see how the ‘asymptotic 
freedom’ property of QCD suggests a dynamical understanding of this picture, 
as will be discussed further in chapter 15 of volume 2. 

What sort of values for $x$ do we expect? Consider an analogous situation:
electron scattering from deuterium. Here the target (the deuteron) is undoubtedly
composite, and its ‘partons’ are, to a first approximation, just the
two nucleons. Since $m_{\rm N} \simeq \frac{1}{2}m_{\rm D}$, we expect to see the value $x \simeq \frac{1}{2}$ (cf (9.24))
favoured; $x = 1$ here would correspond to elastic scattering from the deuteron.
A peak at $x \approx \frac{1}{2}$ is indeed observed (figure 9.5) in quasi-elastic ed scattering
(the broadening of the peak is due to the fact that the constituent nucleons
have some motion within the deuteron). By ‘quasi-elastic’ here we mean that
the incident electron scatters off ‘quasi-free’ nucleons, an approximation we
expect to be good for incident energies significantly greater than the binding
energy of the n and p in the deuteron (~2 MeV). What about the nucleon
itself, then? A simple three-quark model would, on this analogy, lead us to
expect a peak at $x \approx \frac{1}{3}$, but the data already shown (figure 9.2(a)) do not
look much like that. Perhaps there is something else present too, which we
shall uncover as our story proceeds.

Certainly it seems sensible to suppose that a nucleon contains at least some 
quarks (and also antiquarks) of the type introduced in the simple composite 
models of the nucleon (section 1.2.2). If quarks are supposed to have spin-$\frac{1}{2}$,
then the scattering of an electron from a quark or antiquark — generically a 
FIGURE 9.5
Structure function for quasi-elastic ed scattering, plotted against $x$ (Attwood
1980, courtesy SLAC).


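charged parton – of type $i$, charge $e_i$ (in units of $e$) is just given by the e$\mu$
scattering cross section (8.228), with obvious modifications:

$$\frac{d^2\sigma^i}{dQ^2\,d\nu} = \frac{\pi\alpha^2}{4k^2\sin^4(\theta/2)}\,\frac{1}{kk'}\left(e_i^2\cos^2(\theta/2) + e_i^2\,\frac{Q^2}{4m_i^2}\,2\sin^2(\theta/2)\right)\delta(\nu - Q^2/2m_i). \tag{9.27}$$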


This is to be compared with the general inclusive inelastic cross section formula
written in terms of $W_1$ and $W_2$:

$$\frac{d^2\sigma}{dQ^2\,d\nu} = \frac{\pi\alpha^2}{4k^2\sin^4(\theta/2)}\,\frac{1}{kk'}\left[W_2\cos^2(\theta/2) + W_1\,2\sin^2(\theta/2)\right]. \tag{9.28}$$

Thus the contribution to $W_1$ and $W_2$ from one parton of type $i$ is immediately
seen to be


$$W_1^i = e_i^2\,\frac{Q^2}{4M^2x^2}\,\delta(\nu - Q^2/2Mx) \tag{9.29}$$
$$W_2^i = e_i^2\,\delta(\nu - Q^2/2Mx) \tag{9.30}$$


where we have set $m_i = xM$. At large $\nu$ and $Q^2$ it is assumed that the
contributions from different partons add incoherently in cross section. Thus,
to obtain the total contribution from all quark partons, we must sum over the
contributions from all types of partons, $i$, and integrate over all values of $x$,
the momentum fraction carried by the parton. The integral over $x$ must be
weighted by the probability $f_i(x)$ for the parton of type $i$ to have a fraction $x$ of
momentum. These probability distributions, or parton distribution functions
(PDFs), are not predicted by the model and are, in this parton picture,
fundamental parameters of the proton. The structure function $W_2$ becomes


$$W_2(\nu, Q^2) = \sum_i \int_0^1 dx\, f_i(x)\, e_i^2\, \delta(\nu - Q^2/2Mx). \tag{9.31}$$

Using the result for the Dirac $\delta$-function (see appendix E, equation (E.34))

$$\delta(g(x)) = \frac{\delta(x - x_0)}{|dg/dx|_{x=x_0}} \tag{9.32}$$

where $x_0$ is defined by $g(x_0) = 0$, we can rewrite

$$\delta(\nu - Q^2/2Mx) = (x/\nu)\,\delta(x - Q^2/2M\nu) \tag{9.33}$$

under the $x$ integral. Hence we obtain

$$\nu W_2(\nu,Q^2) = \sum_i e_i^2\, x f_i(x) \equiv F_2(x) \tag{9.34}$$

which is the desired scaling behaviour. Similar manipulations lead to

$$M W_1(\nu, Q^2) = F_1(x) \tag{9.35}$$

where

$$2xF_1(x) = F_2(x). \tag{9.36}$$
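
The ‘similar manipulations’ can be made explicit. The following worked step is a sketch, assuming the single-parton contribution (9.29) in the form $W_1^i = e_i^2\,(Q^2/4M^2x^2)\,\delta(\nu - Q^2/2Mx)$, weighting it with $f_i(x)$ exactly as in (9.31), and using the $\delta$-function identity (9.33):

```latex
\begin{align*}
M W_1(\nu,Q^2)
  &= M \sum_i \int_0^1 dx\, f_i(x)\, e_i^2\, \frac{Q^2}{4M^2x^2}\,
     \delta(\nu - Q^2/2Mx) \\
  &= \sum_i \int_0^1 dx\, f_i(x)\, e_i^2\, \frac{Q^2}{4Mx^2}\,
     \frac{x}{\nu}\,\delta(x - Q^2/2M\nu) \\
  &= \sum_i e_i^2 f_i(x)\,\frac{Q^2}{4M\nu x}\bigg|_{x = Q^2/2M\nu}
   = \tfrac{1}{2}\sum_i e_i^2 f_i(x) \equiv F_1(x), \\
2xF_1(x) &= \sum_i e_i^2\, x f_i(x) = F_2(x).
\end{align*}
```

The factor $\frac{1}{2}$ arises because $Q^2 = 2M\nu x$ at the point picked out by the $\delta$-function.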


This relation between $F_1$ and $F_2$ is called the Callan-Gross relation (see
Callan and Gross 1969): it is a direct consequence of our assumption of spin-$\frac{1}{2}$
partons. The physical origin of this relation is best discussed in terms of
virtual photon total cross sections for transverse ($\lambda = \pm 1$) virtual photons
and for a longitudinal/scalar ($\lambda = 0$) virtual photon contribution. The longitudinal/scalar
photon is present because $q^2 \neq 0$ for a virtual photon (see
comment (4) in section 8.3.1). However, in the discussion of polarization
vectors a slight difference occurs for space-like $q^2$. In a frame in which

$$q^\mu = (q^0, 0, 0, q^3) \tag{9.37}$$

the transverse polarization vectors are as before

$$\epsilon^\mu(\lambda = \pm 1) = \mp 2^{-1/2}(0, 1, \pm i, 0) \tag{9.38}$$

with normalization (see equation (7.87))

$$\epsilon^* \cdot \epsilon = -1. \tag{9.39}$$

To construct the longitudinal/scalar polarization vector, we must satisfy

$$q \cdot \epsilon = 0 \tag{9.40}$$

and so are led to the result

$$\epsilon^\mu(\lambda = 0) = (1/\sqrt{Q^2})(q^3, 0, 0, q^0) \tag{9.41}$$

with

$$\epsilon^2(\lambda = 0) = +1. \tag{9.42}$$


The precise definition of a virtual photon cross section is obviously just a 
convention. It is usually taken to be 


$$\sigma_\lambda(\gamma p \to X) = (4\pi^2\alpha/K)\,\epsilon^*_\mu(\lambda)\,\epsilon_\nu(\lambda)\,W^{\mu\nu} \tag{9.43}$$

by analogy with the total cross section for real photons of polarization $\lambda$
incident on an unpolarized proton target. Note the presence of the factor $W^{\mu\nu}$
defined in (9.3). The factor $K$ is the flux factor; for real photons, producing
a final state of mass $W$, this is just the photon energy in the rest frame of the
target nucleon:

$$K = (W^2 - M^2)/2M. \tag{9.44}$$


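In the so-called ‘Hand convention’, this same factor is used for virtual photons
which produce a final state of mass $W$. With these definitions we find (see
problem 9.3) that the transverse ($\lambda = \pm 1$) photon cross section

$$\sigma_T = (4\pi^2\alpha/K)\,\frac{1}{2}\sum_{\lambda=\pm 1}\epsilon^*_\mu(\lambda)\,\epsilon_\nu(\lambda)\,W^{\mu\nu} \tag{9.45}$$

is given by

$$\sigma_T = (4\pi^2\alpha/K)\,W_1 \tag{9.46}$$

and the longitudinal/scalar cross section

$$\sigma_S = (4\pi^2\alpha/K)\,\epsilon^*_\mu(\lambda=0)\,\epsilon_\nu(\lambda=0)\,W^{\mu\nu} \tag{9.47}$$

by

$$\sigma_S = (4\pi^2\alpha/K)\,[(1 + \nu^2/Q^2)W_2 - W_1]. \tag{9.48}$$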


In fact these expressions give an intuitive explanation of the positivity properties
of $W_1$ and $W_2$, namely

$$W_1 \geq 0 \tag{9.49}$$
$$(1 + \nu^2/Q^2)W_2 - W_1 \geq 0. \tag{9.50}$$

The combination in the $\lambda = 0$ cross section is sometimes denoted by $W_L$:

$$W_L = (1 + \nu^2/Q^2)W_2 - W_1. \tag{9.51}$$

The scaling limit of these expressions can be taken using

$$\nu W_2 \to F_2 \tag{9.52}$$
$$M W_1 \to F_1 \tag{9.53}$$
FIGURE 9.6
Photon-parton interaction in the Breit frame. 


and $x = Q^2/2M\nu$ finite, as $Q^2$ and $\nu$ grow large. We find

$$\sigma_T \to \frac{4\pi^2\alpha}{MK}\,F_1(x) \tag{9.54}$$

and

$$\sigma_S \to (4\pi^2\alpha/MK)(1/2x)(F_2 - 2xF_1) \tag{9.55}$$

where we have neglected a term of order $MF_2/\nu$ in the last expression. Thus
the Callan-Gross relation corresponds to the result

$$\sigma_S/\sigma_T \to 0 \tag{9.56}$$

in terms of photon cross sections.
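
The approach to this limit can be illustrated numerically. The sketch below assumes that the scaling forms $MW_1 = F_1$ and $\nu W_2 = F_2$ hold exactly at finite $Q^2$, and imposes the Callan-Gross relation; the flux factor $K$ cancels in the ratio. The function name, the approximate proton mass and the sample values are illustrative choices, not taken from the text. Under these assumptions the ratio built from (9.46) and (9.48) reduces to $Q^2/\nu^2 = 4M^2x^2/Q^2$, which is the neglected term of order $MF_2/\nu$.

```python
M_PROTON = 0.938  # proton mass in GeV (approximate value, assumed for illustration)

def sigma_ratio(x, Q2, F1, F2, M=M_PROTON):
    """sigma_S / sigma_T from (9.46) and (9.48); the flux factor K cancels."""
    nu = Q2 / (2.0 * M * x)       # nu fixed by x = Q^2 / (2 M nu)
    W1 = F1 / M                   # scaling form M W_1 = F_1
    W2 = F2 / nu                  # scaling form nu W_2 = F_2
    return ((1.0 + nu**2 / Q2) * W2 - W1) / W1

x, F1 = 0.25, 1.0                 # the value of F1 is arbitrary: it cancels
F2 = 2.0 * x * F1                 # Callan-Gross relation (9.36)
for Q2 in (1.0, 10.0, 100.0, 1000.0):
    # second column: the closed form 4 M^2 x^2 / Q^2 for comparison
    print(Q2, sigma_ratio(x, Q2, F1, F2), 4.0 * M_PROTON**2 * x**2 / Q2)
```

The ratio falls like $1/Q^2$ at fixed $x$, so it vanishes in the scaling limit.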
A parton calculation using point-like spin-0 partons shows the opposite
result, namely

$$\sigma_T/\sigma_S \to 0. \tag{9.57}$$


Both these results may be understood by considering the helicities of partons
and photons in the so-called parton Breit or ‘brick-wall’ frame. The particular
frame is the one in which the photon and parton are collinear and the
3-momentum of the parton is exactly reversed by the collision (see figure 9.6).
In this frame, the photon transfers no energy, only 3-momentum. The vanishing
of transverse photon cross sections for scalar partons is now obvious.
The transverse photons bring in $\pm 1$ units of the $z$-component of angular momentum:
spin-0 partons cannot absorb this. Thus only the scalar $\lambda = 0$ cross
section is non-zero. For spin-$\frac{1}{2}$ partons the argument is slightly more complicated
in that it depends on the helicity properties of the $\gamma_\mu$ coupling of the
parton to the photon. As is shown in problem 9.4, for massless spin-$\frac{1}{2}$ particles
the $\gamma_\mu$ coupling conserves helicity, i.e. the projection of spin along the direction
of motion of the particle. Thus in the Breit frame, and neglecting parton
masses, conservation of helicity necessitates a change in the $z$-component of
the parton’s angular momentum by $\pm 1$ unit, thereby requiring the absorption
of a transverse photon (figure 9.7). The Lorentz transformation from the
parton Breit frame to the ‘laboratory’ frame does not affect the ratio of transverse
to longitudinal photons, if we neglect the parton transverse momenta.
FIGURE 9.7

Angular momentum balance for absorption of photon by helicity-conserving
spin-$\frac{1}{2}$ parton.
FIGURE 9.8

The ratio $2xF_1/F_2$: ○, $1.5 < Q^2 < 4$ GeV$^2$; ●, $0.5 < Q^2 < 11$ GeV$^2$; ×, $12 <
Q^2 < 16$ GeV$^2$. (Figure from D H Perkins Introduction to High Energy Physics
3rd edn, copyright 1987; reprinted by permission of Pearson Education, Inc.,
Upper Saddle River, NJ.)


These arguments therefore make clear the origin of the Callan-Gross relation.
Experimentally, the Callan-Gross relation is reasonably well satisfied
in that $R = \sigma_S/\sigma_T$ is small for most, if not all, of the deep inelastic regime
(figure 9.8). This leads us to suppose that the electrically charged partons
coupling to photons have spin-$\frac{1}{2}$.
9.3 Partons as quarks and gluons 


We now proceed a stage further, with the idea that the charged partons are 
quarks (and antiquarks). If we assume that the photon only couples to these 
objects, we can make more specific scaling predictions. The quantum numbers 
of the quarks have been given in Table 1.2. For a proton we have the result 
(cf (9.34)) 


$$F_2^{\rm ep}(x) = x\left\{\tfrac{4}{9}[u(x) + \bar u(x)] + \tfrac{1}{9}[d(x) + \bar d(x) + s(x) + \bar s(x)] + \cdots\right\} \tag{9.58}$$

where $u(x)$ is the probability distribution for u quarks in the proton, $\bar u(x)$ for
$\bar{\rm u}$ antiquarks and so on in an obvious notation, and the dots indicate further
possible flavours. So far we do not seem to have gained much, replacing 
one unknown function by six or more unknown functions. The full power of 
the quark parton model lies in the fact that the same distribution functions 
appear, in different combinations, for neutron targets, and in the analogous 
scaling functions for deep inelastic scattering with neutrino and antineutrino 
beams (see volume 2). For electron scattering from neutron targets we can use 
I-spin invariance (see for example Close 1979, or Leader and Predazzi 1996) 
to relate the distribution of u and d quarks in a neutron to the distributions 
in a proton, and similarly for the antiquarks. The results are 


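$$u^{\rm p}(x) = d^{\rm n}(x) \equiv u(x) \qquad \bar u^{\rm p}(x) = \bar d^{\rm n}(x) \equiv \bar u(x) \tag{9.59}$$
$$d^{\rm p}(x) = u^{\rm n}(x) \equiv d(x) \qquad \bar d^{\rm p}(x) = \bar u^{\rm n}(x) \equiv \bar d(x) \tag{9.60}$$
$$s^{\rm p}(x) = s^{\rm n}(x) \equiv s(x) \qquad \bar s^{\rm p}(x) = \bar s^{\rm n}(x) \equiv \bar s(x) \tag{9.61}$$

and

$$F_2^{\rm en}(x) = x\left\{\tfrac{4}{9}[d(x) + \bar d(x)] + \tfrac{1}{9}[u(x) + \bar u(x) + s(x) + \bar s(x)] + \cdots\right\}. \tag{9.62}$$
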
The quark distributions inside the proton and neutron must satisfy some
constraints. Since both proton and neutron have strangeness zero, we have a
sum rule (treating only u, d and s flavours from now on)

$$\int_0^1 dx\,[s(x) - \bar s(x)] = 0. \tag{9.63}$$

Similarly, from the proton and neutron charges we obtain two other sum rules:

$$\int_0^1 dx\,\left\{\tfrac{2}{3}[u(x) - \bar u(x)] - \tfrac{1}{3}[d(x) - \bar d(x)]\right\} = 1 \tag{9.64}$$
$$\int_0^1 dx\,\left\{\tfrac{2}{3}[d(x) - \bar d(x)] - \tfrac{1}{3}[u(x) - \bar u(x)]\right\} = 0. \tag{9.65}$$
These are equivalent to the sum rules

$$2 = \int_0^1 dx\,[u(x) - \bar u(x)] \tag{9.66}$$
$$1 = \int_0^1 dx\,[d(x) - \bar d(x)] \tag{9.67}$$


which are, of course, just the excess of u and d quarks over antiquarks inside 
the proton. Testing these sum rules requires neutrino data to separate the 
various structure functions, as we shall explain in volume 2, chapter 20. 

One can gain some further insight if one is prepared to make a model. For 
example, one can introduce the idea of ‘valence’ quarks (those of the elementary
constituent quark model) and ‘sea’ quarks ($q\bar q$ pairs created virtually).
Then, in a proton, the u and d quark distributions would be parametrized by
the sum of valence and sea contributions

$$u = u_V + q_S \tag{9.68}$$
$$d = d_V + q_S \tag{9.69}$$

while the antiquark and strange quark distributions are taken to be pure sea

$$\bar u = \bar d = s = \bar s = q_S \tag{9.70}$$

where we have assumed that the ‘sea’ is flavour-independent. Such a model
replaces the six unknown functions now in play by three, and is consequently
more predictive. The strangeness sum rule (9.63) is now satisfied automatically,
while (9.66) and (9.67) are satisfied by the valence distributions alone:

$$\int_0^1 dx\,u_V(x) = 2 \tag{9.71}$$
$$\int_0^1 dx\,d_V(x) = 1. \tag{9.72}$$

One more important sum rule emerges from the picture of $xf_i(x)$ as the
fractional momentum carried by quark $i$. This is the momentum sum rule

$$\int_0^1 dx\,x\,[u(x) + \bar u(x) + d(x) + \bar d(x) + s(x) + \bar s(x)] = 1 - \epsilon \tag{9.73}$$

where $\epsilon$ is interpreted as the fraction of the proton momentum that is not
carried by quarks and antiquarks. The integral in (9.73) is directly related
to $\nu$ and $\bar\nu$ cross sections, and its evaluation implies $\epsilon \simeq \frac{1}{2}$ (the CHARM
(1981) result was $1 - \epsilon = 0.44 \pm 0.02$). This suggests that about half the
total momentum is carried by uncharged objects. These remaining partons
are identified with the gluons of QCD. They have their own PDF, $g(x)$.
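
The bookkeeping of these sum rules can be illustrated with a toy valence-plus-sea model. The sketch below is not a fit to data: the three shapes `u_valence`, `d_valence` and `sea` are hypothetical polynomials, chosen only so that the integrals are easy to evaluate, and their normalizations were picked by hand so that the quarks carry roughly half of the momentum. It checks the number sum rules (9.66) and (9.67), the strangeness and charge sum rules, and evaluates $\epsilon$ from the momentum sum rule.

```python
def integrate(func, n=20000):
    """Midpoint-rule integral of func over 0 < x < 1."""
    h = 1.0 / n
    return h * sum(func((k + 0.5) * h) for k in range(n))

# Toy shapes (assumptions for illustration only, not measured PDFs).
def u_valence(x):
    return 10.0 * (1.0 - x)**4    # integrates to 2

def d_valence(x):
    return 6.0 * (1.0 - x)**5     # integrates to 1

def sea(x):
    return 0.3 * (1.0 - x)**7     # flavour-independent sea

def u(x):
    return u_valence(x) + sea(x)

def d(x):
    return d_valence(x) + sea(x)

ubar = dbar = s = sbar = sea      # pure-sea distributions

n_u = integrate(lambda x: u(x) - ubar(x))
n_d = integrate(lambda x: d(x) - dbar(x))
strangeness = integrate(lambda x: s(x) - sbar(x))
proton_charge = integrate(
    lambda x: (2 / 3) * (u(x) - ubar(x)) - (1 / 3) * (d(x) - dbar(x)))
neutron_charge = integrate(
    lambda x: (2 / 3) * (d(x) - dbar(x)) - (1 / 3) * (u(x) - ubar(x)))
quark_momentum = integrate(
    lambda x: x * (u(x) + ubar(x) + d(x) + dbar(x) + s(x) + sbar(x)))
epsilon = 1.0 - quark_momentum    # momentum fraction not carried by quarks

print(n_u, n_d, strangeness, proton_charge, neutron_charge, epsilon)
```

In this toy model the sea cancels in every number sum rule, as it must, while it does contribute to the momentum integral. The value of `epsilon` close to one half is built in by the choice of shapes and is not a prediction.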

An enormous effort, both experimental and theoretical, has gone into de- 
termining the parton distribution functions. The subject is regularly reviewed 
FIGURE 9.9

Distributions of $x$ times the unpolarized parton distribution functions $f(x)$
(where $f = u_V, d_V, \bar u, \bar d, s, c, b, g$) and their associated uncertainties using the
MSTW2008 parametrization (Martin et al. 2009) at a scale $\mu^2 = 10$ GeV$^2$
and $\mu^2 = 10,000$ GeV$^2$. [Figure reproduced courtesy Michael Barnett, for the
Particle Data Group, from the review of Structure Functions by B F Foster,
A D Martin and M G Vincter, section 16 in the Review of Particle Physics,
K Nakamura et al. (Particle Data Group) Journal of Physics G 37 (2010)
075021, IOP Publishing Limited.] (See color plate I.)


by the Particle Data Group (currently Nakamura et al. 2010). Figure 9.9
shows the result of one analysis. In this much more sophisticated approach,
which includes higher order QCD corrections, it is necessary to specify a particular
value of $Q^2$ (here denoted by $Q^2 = \mu^2$) at which the distributions are
defined, as explained in chapter 15 of volume 2. The distributions at this
value are quantities to be determined from experiment. The distributions at
other values of $Q^2$ are then predicted by perturbative QCD.


The main features of the PDFs shown in figure 9.9 are: the valence quark
distributions are peaked at around $x = 0.2$, and go to zero for $x \to 0$ and
$x \to 1$; the sea quarks, on the other hand, have a high probability of carrying
very low momentum fractions, as do the gluons; in fact, the gluons dominate
for $x$ below about 0.1. This is then the picture of ‘what nucleons are made
of’, as revealed by some 40 years of research.
FIGURE 9.10
Drell-Yan process.


9.4 The Drell-Yan process


Much of the importance of the parton model lies outside its original domain of 
deep inelastic scattering. In deep inelastic scattering it is possible to provide 
a more formal basis for the parton model in terms of light-cone and short- 
distance operator expansions (see chapter 18 of Peskin and Schroeder 1995). 
The advantage of the parton formulation lies in the fact that it suggests other 
processes for which a parton description may be relevant but for which formal 
operator arguments are not possible. One such example is the Drell-Yan 
process (Drell and Yan 1970) 


$$p + p \to \mu^+\mu^- + X \tag{9.74}$$

in which a $\mu^+\mu^-$ pair is produced in proton-proton collisions along with unobserved
hadrons X, as shown in figure 9.10. The assumption of the parton
model is that in the limit

$$s \to \infty \quad \text{with } \tau = q^2/s \text{ finite} \tag{9.75}$$

the dominant process is that shown in figure 9.11: a quark and antiquark from
different hadrons are assumed to annihilate to a virtual photon which then
decays to a $\mu^+\mu^-$ pair (compare figures 9.3 and 9.4), the remaining quarks
and antiquarks subsequently emerging as hadrons.

Let us work in the CM system and neglect all masses. In this case we have 


$$p_1^\mu = (P,0,0,P) \qquad p_2^\mu = (P,0,0,-P) \tag{9.76}$$

and

$$s = 4P^2. \tag{9.77}$$

Neglecting quark masses and transverse momenta, we have quark momenta

$$p_{q_1}^\mu = x_1(P,0,0,P) \tag{9.78}$$
$$p_{q_2}^\mu = x_2(P,0,0,-P) \tag{9.79}$$
FIGURE 9.11
Parton model amplitude for the Drell-Yan process. 


and the photon momentum

$$q = p_{q_1} + p_{q_2} \tag{9.80}$$

has non-zero components

$$q^0 = (x_1 + x_2)P \tag{9.81}$$
$$q^3 = (x_1 - x_2)P. \tag{9.82}$$

Thus we find

$$q^2 = 4x_1x_2P^2 \tag{9.83}$$

and hence

$$\tau = q^2/s = x_1x_2. \tag{9.84}$$
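
These kinematic relations are simple enough to check directly. The following is a minimal sketch, assuming massless collinear partons in the CM frame as in (9.76) to (9.79); the function name and the sample values of $x_1$, $x_2$ and $P$ are illustrative choices.

```python
def drell_yan_kinematics(x1, x2, P):
    """Photon momentum components and invariants for massless collinear partons."""
    q0 = (x1 + x2) * P            # energy component, (9.81)
    q3 = (x1 - x2) * P            # longitudinal component, (9.82)
    q2 = q0**2 - q3**2            # invariant mass squared of the lepton pair
    s = 4.0 * P**2                # CM energy squared, (9.77)
    tau = q2 / s
    return q0, q3, q2, s, tau

q0, q3, q2, s, tau = drell_yan_kinematics(0.3, 0.1, 31.0)
print(q2, s, tau)
```

For any beam momentum `P`, the returned `tau` equals the product `x1 * x2`, so a measurement of the lepton-pair mass at fixed $s$ constrains only the product of the two momentum fractions.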


The cross section for the basic process

$$q\bar q \to \mu^+\mu^- \tag{9.85}$$

is calculated using the result of problem 8.18. Since the QED process

$$e^+e^- \to \mu^+\mu^- \tag{9.86}$$

has the cross section (neglecting all masses)

$$\sigma(e^+e^- \to \mu^+\mu^-) = 4\pi\alpha^2/3q^2 \tag{9.87}$$

we expect the result for a quark of type a with charge $e_a$ (in units of $e$) to be

$$\sigma(q_a\bar q_a \to \mu^+\mu^-) = (4\pi\alpha^2/3q^2)\,e_a^2. \tag{9.88}$$


To obtain the parton model prediction for proton-proton collisions, one merely
multiplies this cross section by the probabilities for finding a quark of type a
with momentum fraction $x_1$, and an antiquark of the same type with fraction
$x_2$, namely

$$q_a(x_1)\,dx_1\,\bar q_a(x_2)\,dx_2. \tag{9.89}$$

There is, of course, another contribution for which the antiquark has fraction
$x_1$ and the quark $x_2$:

$$\bar q_a(x_1)\,dx_1\,q_a(x_2)\,dx_2. \tag{9.90}$$

Thus the Drell-Yan prediction is

$$d^2\sigma(pp \to \mu^+\mu^- + X) = \frac{4\pi\alpha^2}{9q^2}\sum_a e_a^2\,[q_a(x_1)\bar q_a(x_2) + \bar q_a(x_1)q_a(x_2)]\,dx_1\,dx_2 \tag{9.91}$$

where we have included a factor $\frac{1}{3}$ to account for the colour of the quarks:
in order to make a colour singlet photon, one needs to match the colours of
quark and antiquark. Equation (9.91) is the master formula. Its importance
lies in the fact that the same quark distribution functions are measured in
deep inelastic lepton scattering so one can make absolute predictions.$^3$ For
example, if the photon in figure 9.11 is replaced by a W(Z), one can predict
W(Z) production cross sections, as we shall see in volume 2.
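
To make the structure of the master formula concrete, the sketch below evaluates $d^2\sigma/dx_1\,dx_2$ for u, d and s quarks, using $q^2 = x_1x_2s$. The parton distributions are hypothetical toy polynomials (a valence part plus a flavour-independent sea), chosen only for illustration and not fitted to any data; the function and dictionary names are likewise illustrative. The result is in natural units (GeV$^{-2}$ when $s$ is in GeV$^2$).

```python
import math

ALPHA = 1.0 / 137.036             # fine-structure constant (approximate value)
CHARGE = {"u": 2.0 / 3.0, "d": -1.0 / 3.0, "s": -1.0 / 3.0}

def sea(x):
    """Toy flavour-independent sea distribution (assumption, not a fit)."""
    return 0.3 * (1.0 - x)**7

# Toy quark and antiquark distributions in the proton (assumptions).
QUARK = {
    "u": lambda x: 10.0 * (1.0 - x)**4 + sea(x),
    "d": lambda x: 6.0 * (1.0 - x)**5 + sea(x),
    "s": sea,
}
ANTIQUARK = {"u": sea, "d": sea, "s": sea}

def drell_yan_d2sigma(x1, x2, s):
    """d^2 sigma / dx1 dx2 for pp -> mu+ mu- + X, following (9.91)."""
    q2 = x1 * x2 * s                                    # from tau = x1 x2
    partonic = 4.0 * math.pi * ALPHA**2 / (9.0 * q2)    # includes colour factor 1/3
    luminosity = sum(
        CHARGE[a]**2 * (QUARK[a](x1) * ANTIQUARK[a](x2)
                        + ANTIQUARK[a](x1) * QUARK[a](x2))
        for a in CHARGE)
    return partonic * luminosity

for s in (100.0, 1.0e4):
    print(s, s * drell_yan_d2sigma(0.3, 0.1, s))
```

The product of $s$ with the differential cross section is the same for both values of $s$: once the dimensionful factor is removed, what remains depends only on the momentum fractions. This is the simplest form of the scaling property of the model.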

We would expect some ‘scaling’ property to hold for this cross section, following
from the point-like constituent cross section (9.88). One way to exhibit
this is to use the variables $q^2$ and $x_F = x_1 - x_2$ as discussed in problem 9.6.
There it is shown that the dimensionless quantity

$$q^4\,\frac{d^2\sigma}{dq^2\,dx_F} \tag{9.92}$$

should be a function of $x_F$ and the ratio $\tau = q^2/s$. The data bear out this
prediction well (see figure 9.12).

Furthermore, the assumption that the lepton pair is produced via quark- 
antiquark annihilation to a virtual photon can be checked by observing the 
angular distribution of either lepton in the dilepton rest frame, relative to the 
incident proton beam direction. This distribution is expected to be the same 
as in $e^+e^- \to \mu^+\mu^-$, namely (cf (8.194))

$$d\sigma/d\Omega \propto (1 + \cos^2\theta) \tag{9.93}$$

as is indeed observed (figure 9.13). Note that figure 9.13 provides evidence
that the quarks have spin-$\frac{1}{2}$: if they are assumed to have spin-0, the angular
distribution would be (see problem 9.7) proportional to $(1 - \cos^2\theta)$, and this
is clearly ruled out.


$^3$QCD corrections make the connection more complicated, but still perturbatively
computable.
FIGURE 9.12

The dimensionless cross section $M^3 d^2\sigma/dM\,dx_F$ ($M = \sqrt{q^2}$) at $x_F = 0$ for
pN scattering, plotted against $\sqrt{\tau} = M/\sqrt{s}$ (Scott 1985): ●, $\sqrt{s} = 62$ GeV;
, 44; D, 27.4; O, 23.8.
FIGURE 9.13

Angular distribution of muons, measured in the $\mu^+\mu^-$ rest frame, relative
to the incident beam direction, in the Drell-Yan process. (Figure from D
H Perkins Introduction to High Energy Physics 3rd edn, copyright 1987;
reprinted by permission of Pearson Education, Inc., Upper Saddle River, NJ.)
FIGURE 9.14
$e^+e^-$ annihilation to hadrons in one-photon approximation.


9.5 $e^+e^-$ annihilation into hadrons


The last electromagnetic process we wish to consider is electron—positron an- 
nihilation into hadrons (figure 9.14): 


$$e^+e^- \to X. \tag{9.94}$$

As usual, the dominance of the one-photon intermediate state is assumed.
Figure 9.14 is clearly a generalization of figure 8.9, the latter describing the
particular case in which the final hadronic state is $\pi^+\pi^-$. As a preliminary
to discussing (9.94), let us therefore revisit $e^+e^- \to \pi^+\pi^-$ first.

The $O(e^2)$ amplitude is given in equation (8.159). We shall simplify the
calculation by neglecting both the electron and the pion masses. The spinor
part of the amplitude is then $-2\bar v(k_1)\not{p}_1 u(k)$, and the ‘$L \cdot T$’ product is
$16(k \cdot p_1)(k_1 \cdot p_1)$. Borrowing the general CM cross section formula (6.129) from
chapter 6 as in (8.121), and including the pion form factor, we obtain for the
unpolarized CM differential cross section


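$$\frac{d\bar\sigma}{d\Omega} = \frac{F^2(q^2)\alpha^2}{4q^2}\,(1 - \cos^2\theta) \tag{9.95}$$

and the total unpolarized cross section is

$$\bar\sigma = F^2(q^2)\,\frac{2\pi\alpha^2}{3q^2}. \tag{9.96}$$
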
The cross section $\bar\sigma$ contains a $1/q^2$ factor, just like that for $e^+e^- \to \mu^+\mu^-$ as
in (9.87), but this ‘pointlike’ behaviour is modified by the square of the form
factor, evaluated at time-like $q^2$. When the measured $\bar\sigma$ is plotted against $q^2$
for $q^2 < 1$ (GeV)$^2$, a pronounced resonance is seen at $q^2 \approx m_\rho^2$, superimposed
on the smooth $1/q^2$ background, where $m_\rho$ is the mass of the rho resonance
($J^P = 1^-$ $q\bar q$ state). The interpretation of this is shown in figure 9.15. $F(q^2)$
should therefore be parametrized as a resonance, as in (6.107), or a more
sophisticated version to take account of the fact that the $\pi$’s are emitted in an
FIGURE 9.15
$\rho$-dominance of the pion electromagnetic form factor in the time-like ($q^2 > 0$)
region.


$l = 1$ state. Just as $F^2(q^2)$ modified the point-like cross section in the space-like
region for $e^-\pi^+ \to e^-\pi^+$, so here it modifies the point-like ($\sim 1/q^2$)
behaviour in the time-like region.

Returning now to the process (9.94), the cross section for it is shown as a
function of CM energy $(q^2)^{1/2}$ in figure 9.16. The general point-like fall-off as
$1/q^2$ is seen, with peaks due to a succession of boson resonances superimposed
($\rho$, $J/\psi$, $\Upsilon$, $Z^0$, ...). The $1/q^2$ fall-off is suggestive of a (point-like) parton
picture and indeed the process (9.94) is similar to the Drell-Yan one:

$$pp \to \mu^+\mu^- + X. \tag{9.97}$$

It is natural to imagine that at large $q^2$ the basic subprocess is quark-antiquark
pair creation (figure 9.17). The total cross section for $q\bar q$ pair production is
then (cf (9.88))

$$\sigma(e^+e^- \to q_a\bar q_a) = (4\pi\alpha^2/3q^2)\,e_a^2. \tag{9.98}$$

In the vicinity of mesonic resonances such as the $\rho$, we can infer that the
dominant component in the final state is that in which the $q\bar q$ pair is strongly
bound into a mesonic state, which then decays into hadrons. Away from resonances,
and increasingly at larger values of $q^2$, the produced q and $\bar{\rm q}$ seek to
separate from the interaction region. As they draw apart, however, the interaction
between them increases (recall section 1.3.6), producing more $q\bar q$ pairs,
together with radiated gluons. In this process, the coloured quarks and gluons
eventually must form colourless hadrons, since we know that no coloured
particles have been observed (‘confinement of colour’). If one assumes that
the presumed colour confinement mechanism does not affect the prediction
(9.98), then we arrive at the result

$$\sigma(e^+e^- \to \text{hadrons}) = (4\pi\alpha^2/3q^2)\sum_a e_a^2 \tag{9.99}$$

at large $q^2$, where ‘a’ includes all flavours produced at that energy.
FIGURE 9.16

The cross section $\sigma$ for the annihilation process $e^+e^- \to$ hadrons, and the
ratio $R$ (see equation (9.100)), as a function of CM energy. [Figure reproduced
courtesy Michael Barnett, for the Particle Data Group, from the Review of
Particle Physics, K Nakamura et al. (Particle Data Group) Journal of Physics
G 37 (2010) 075021 IOP Publishing Limited.] (See color plate II.)


FIGURE 9.17
Parton model subprocess in $e^+e^- \to$ hadrons.

FIGURE 9.18
Two-jet event in $e^+e^-$ annihilation from the TASSO detector at the $e^+e^-$
storage ring PETRA.


This model is best tested by taking out the dominant 1/q? behaviour and 
plotting the ratio 


$$R = \frac{\sigma(e^+e^- \to \text{hadrons})}{\sigma(e^+e^- \to \mu^+\mu^-)} = \sum_a e_a^2. \tag{9.100}$$

For the light quarks u, d and s occurring in three colours, we therefore predict

$$R = 3\left[(\tfrac{2}{3})^2 + (-\tfrac{1}{3})^2 + (-\tfrac{1}{3})^2\right] = 2. \tag{9.101}$$

Above the c threshold but below the b threshold we expect $R = \frac{10}{3}$, and
above the b threshold $R = \frac{11}{3}$. These expectations are in reasonable accord
with experiment, especially at energies well beyond the resonance region and
the b threshold, as figure 9.16 shows. In this figure the dotted curve is the
prediction of the quark-parton model, equation (9.99). The solid curve includes
perturbative QCD corrections, which we will return to in chapter 15 of
volume 2.
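
The step-like behaviour of $R$ across flavour thresholds is a matter of simple arithmetic, which the following sketch makes explicit using exact fractions. It assumes the standard quark charges and three colours; the names `QUARK_CHARGES`, `N_COLOURS` and `r_ratio` are illustrative.

```python
from fractions import Fraction

# Quark charges in units of e.
QUARK_CHARGES = {
    "u": Fraction(2, 3),
    "d": Fraction(-1, 3),
    "s": Fraction(-1, 3),
    "c": Fraction(2, 3),
    "b": Fraction(-1, 3),
}
N_COLOURS = 3

def r_ratio(flavours):
    """Parton-model R: colour factor times the sum of squared charges."""
    return N_COLOURS * sum(QUARK_CHARGES[f]**2 for f in flavours)

for active in ("uds", "udsc", "udscb"):
    print(active, r_ratio(active))
```

Each new flavour adds three times its squared charge: $\frac{4}{3}$ for charm and $\frac{1}{3}$ for bottom.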

The success of this prediction leads one to consider more detailed consequences
of the picture. For example, the angular distribution of massless
spin-$\frac{1}{2}$ quarks is expected to be (cf (8.194) again)

$$d\sigma/d\Omega = (\alpha^2/4q^2)\,e_a^2\,(1 + \cos^2\theta) \tag{9.102}$$

just as for the $\mu^+\mu^-$ process. However, in this case there is an important
difference: the quarks are not observed! Nevertheless a remarkable ‘memory’
of (9.102) is retained by the observed final-state hadrons. Experimentally one



[Figure 9.19 plot: $(1/\sigma)\,d\sigma/d(\cos\theta)$ against $|\cos\theta|$, at $W = 34$ GeV.]

FIGURE 9.19

Angular distribution of jets in two-jet events, measured in the two-jet rest
frame, relative to the incident beam direction, in the process $e^+e^- \to$ two jets
(Althoff et al. 1984). The full curve is the $(1 + \cos^2\theta)$ distribution. Since it
is not possible to say which jet corresponded to the quark and which to the
antiquark, only half the angular distribution can be plotted. The asymmetry
visible in figure 8.20(b) is therefore not apparent.


observes events in which hadrons emerge from the interaction region in two
relatively well-collimated cones or ‘jets’ (see figure 9.18). The distribution
of events as a function of the (inferred) angle of the jet axis is shown in
figure 9.19 and is in good agreement with (9.102). The interpretation is that
the primary process is $e^+e^- \to q\bar q$, the quark and the antiquark then turning
into hadrons as they separate and experience the very strong colour forces,
but without losing the memory of the original quark angular distribution. We
shall discuss jets more fully in chapter 14 of volume 2, in the context of QCD.
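
As a consistency check on the normalization, integrating the angular distribution (9.102) over the full solid angle should reproduce the total cross section (9.98). The following worked step is a sketch of that check for a single quark type a, with no colour factor included on either side:

```latex
\begin{align*}
\sigma(e^+e^- \to q_a\bar q_a)
  &= \int d\Omega\,\frac{\alpha^2}{4q^2}\,e_a^2\,(1+\cos^2\theta)
   = \frac{\alpha^2 e_a^2}{4q^2}\,2\pi\int_{-1}^{1} d(\cos\theta)\,(1+\cos^2\theta) \\
  &= \frac{\alpha^2 e_a^2}{4q^2}\,2\pi\left(2 + \tfrac{2}{3}\right)
   = \frac{4\pi\alpha^2}{3q^2}\,e_a^2 .
\end{align*}
```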




Problems 


9.1 The various normalization factors in equations (9.3) and (9.11) may be
checked in the following way. The cross section for inclusive electron-proton
scattering may be written (equation (9.11)):

$$d\sigma = \left(\frac{4\pi\alpha}{q^2}\right)^2 \frac{1}{4[(k\cdot p)^2 - m^2M^2]^{1/2}}\,4\pi M\,L_{\mu\nu}W^{\mu\nu}\,\frac{d^3k'}{2\omega'(2\pi)^3} \tag{9.103}$$

in the usual one-photon exchange approximation, and the tensor $W^{\mu\nu}$ is related
to hadronic matrix elements of the electromagnetic current operator by
equation (9.3):

$$e^2W^{\mu\nu}(q,p) = \frac{1}{4\pi M}\,\frac{1}{2}\sum_s\sum_X \langle p; p, s|\hat j^\mu_{\rm em}(0)|X; p'\rangle\langle X; p'|\hat j^\nu_{\rm em}(0)|p; p, s\rangle\,(2\pi)^4\delta^4(p + q - p')$$

where the sum X is over all possible hadronic final states. If we consider the
special case of elastic scattering, the sum over X is only over the final proton’s
degrees of freedom:

$$e^2W^{\mu\nu}_{\rm el} = \frac{1}{4\pi M}\,\frac{1}{2}\sum_{s,s'} \langle p; p, s|\hat j^\mu_{\rm em}(0)|p; p', s'\rangle\langle p; p', s'|\hat j^\nu_{\rm em}(0)|p; p, s\rangle \times (2\pi)^4\delta^4(p + q - p')\,\frac{1}{(2\pi)^3}\,\frac{d^3p'}{2E'}.$$

Now use equation (8.208) with $F_1 = 1$ and $\kappa = 0$ (i.e. the electromagnetic
current matrix element for a ‘point’ proton) to show that the resulting cross
section is identical to that for elastic e$\mu$ scattering.


9.2

(a) Perform the contraction $L_{\mu\nu}W^{\mu\nu}$ for inclusive inelastic electron-proton
scattering (remember $q^\mu L_{\mu\nu} = q^\nu L_{\mu\nu} = 0$). Hence verify
that the inclusive differential cross section in terms of ‘laboratory’
variables, and neglecting the electron mass, has the form

$$\frac{d^2\sigma}{d\Omega\,dk'} = \frac{\alpha^2}{4k^2\sin^4(\theta/2)}\,[W_2\cos^2(\theta/2) + W_1\,2\sin^2(\theta/2)].$$

(b) By calculating the Jacobian

$$J = \begin{vmatrix} \partial u/\partial x & \partial u/\partial y \\ \partial v/\partial x & \partial v/\partial y \end{vmatrix}$$

for a change of variables $(x, y) \to (u, v)$

$$du\,dv = |J|\,dx\,dy$$

find expressions for $d^2\sigma/dQ^2\,d\nu$ and $d^2\sigma/dx\,dy$, where $Q^2$ and $\nu$
have their usual significance, and $x$ is the scaling variable $Q^2/2M\nu$
and $y = \nu/k$.


9.3 Consider the description of inelastic electron-proton scattering in terms
of virtual photon cross sections:

(a) In the ‘laboratory’ frame with

$$p^\mu = (M,0,0,0) \quad\text{and}\quad q^\mu = (q^0,0,0,q^3)$$

evaluate the transverse spin sum

$$\frac{1}{2}\sum_{\lambda=\pm 1}\epsilon_\mu(\lambda)\,\epsilon^*_\nu(\lambda)\,W^{\mu\nu}.$$

Hence show that the ‘Hand’ cross section for transverse virtual photons is

$$\sigma_T = (4\pi^2\alpha/K)\,W_1.$$

(b) Using the definition

$$\epsilon_S^\mu = (1/\sqrt{Q^2})(q^3, 0, 0, q^0)$$

and rewriting this in terms of the ‘laboratory’ 4-vectors $p^\mu$ and $q^\mu$,
evaluate the longitudinal/scalar virtual photon cross section. Hence
show that

$$W_2 = \frac{K}{4\pi^2\alpha}\,\frac{Q^2}{Q^2 + \nu^2}\,(\sigma_S + \sigma_T).$$


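9.4 In this problem, we consider the representation of the 4 × 4 Dirac matrices
in which (see (3.40))

$$\boldsymbol{\alpha} = \begin{pmatrix} \boldsymbol{\sigma} & 0 \\ 0 & -\boldsymbol{\sigma} \end{pmatrix}, \qquad \beta = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.$$

Define also the 4 × 4 matrix $\gamma_5 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$ and the Dirac four-component
spinor $u = \begin{pmatrix} \phi \\ \chi \end{pmatrix}$. Then the two-component spinors $\phi$, $\chi$ satisfy

$$\begin{aligned} \boldsymbol{\sigma}\cdot\boldsymbol{p}\,\phi &= E\phi - m\chi \\ \boldsymbol{\sigma}\cdot\boldsymbol{p}\,\chi &= -E\chi + m\phi. \end{aligned}$$


(a) Show that for a massless Dirac particle, $\phi$ and $\chi$ become helicity
eigenstates (see section 3.3) with positive and negative helicity respectively.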

(b) Defining

$$P_R = \frac{1+\gamma_5}{2}, \qquad P_L = \frac{1-\gamma_5}{2},$$

show that $P_R^2 = P_R$, $P_L^2 = P_L$, $P_R P_L = 0 = P_L P_R$, and that $P_R + P_L = 1$.

Show also that

$$P_R \begin{pmatrix} \phi \\ \chi \end{pmatrix} = \begin{pmatrix} \phi \\ 0 \end{pmatrix}, \qquad P_L \begin{pmatrix} \phi \\ \chi \end{pmatrix} = \begin{pmatrix} 0 \\ \chi \end{pmatrix}$$

and hence that $P_R$ and $P_L$ are projection operators for massless
Dirac particles, onto states of definite helicity. Discuss what happens
when $m \neq 0$.

(c) The general massless spinor $u$ can be written

$$u = (P_L + P_R)u \equiv u_L + u_R$$

where $u_L$, $u_R$ have the indicated helicities. Show that

$$\bar{u}\gamma^\mu u = \bar{u}_L\gamma^\mu u_L + \bar{u}_R\gamma^\mu u_R$$

where $\bar{u}_L = u_L^\dagger\gamma^0$, $\bar{u}_R = u_R^\dagger\gamma^0$; and deduce that in electromagnetic
interactions of massless fermions helicity is conserved.

(d) In weak interactions an axial vector current $\bar{u}\gamma^\mu\gamma_5 u$ also enters. Is
helicity still conserved?

(e) Show that the ‘Dirac’ mass term $m\bar\psi\psi$ may be written as
$m(\bar\psi_L\psi_R + \bar\psi_R\psi_L)$.
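The algebra asked for in part (b) can be checked by brute force. The following is a minimal sketch, assuming the block form $\gamma_5 = {\rm diag}(1, 1, -1, -1)$ of this representation; the helper names `diag` and `matmul` are ours and purely illustrative.

```python
# Numerical check of the chiral projector algebra of problem 9.4(b).
# Assumption: gamma5 = diag(1, 1, -1, -1), i.e. +1 on the phi block
# and -1 on the chi block of u = (phi, chi).

def diag(entries):
    """Square matrix (list of rows) with the given diagonal entries."""
    n = len(entries)
    return [[entries[i] if i == j else 0.0 for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Plain matrix product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

IDENTITY = diag([1.0, 1.0, 1.0, 1.0])
GAMMA5 = diag([1.0, 1.0, -1.0, -1.0])
ZERO = diag([0.0, 0.0, 0.0, 0.0])

# P_R = (1 + gamma5)/2 and P_L = (1 - gamma5)/2
P_R = [[0.5 * (IDENTITY[i][j] + GAMMA5[i][j]) for j in range(4)] for i in range(4)]
P_L = [[0.5 * (IDENTITY[i][j] - GAMMA5[i][j]) for j in range(4)] for i in range(4)]
P_SUM = [[P_R[i][j] + P_L[i][j] for j in range(4)] for i in range(4)]

print("P_R^2 == P_R:", matmul(P_R, P_R) == P_R)
print("P_L^2 == P_L:", matmul(P_L, P_L) == P_L)
print("P_R P_L == 0:", matmul(P_R, P_L) == ZERO)
print("P_R + P_L == 1:", P_SUM == IDENTITY)
```

Idempotency ($P^2 = P$) rather than $P^2 = 1$ is what makes these matrices projectors: $P_R$ keeps the $\phi$ components and annihilates $\chi$, and vice versa for $P_L$.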


9.5 In the HERA colliding beam machine, positrons of total energy 27.5 GeV 
collide head on with protons of total energy 820 GeV. Neglecting both the 
positron and the proton rest masses, calculate the centre-of-mass energy in 
such a collision process. 

Some theories have predicted the existence of ‘leptoquarks’, which could 
be produced at HERA as a resonance state formed from the incident positron 
and the struck quark. How would a distribution of such events look, if plotted 
versus the variable x? 
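As a numerical cross-check on the kinematics of this problem, here is a small sketch. It assumes massless beams colliding head on, so that $s = 4E_eE_p$, and uses the standard parton-model relation $M^2 = xs$ for a resonance formed from the positron and a quark carrying momentum fraction $x$. The function names and the 200 GeV mass are hypothetical illustrations, not values from the text.

```python
import math

def cm_energy_massless(e1, e2):
    """Centre-of-mass energy for a head-on collision of two massless particles.

    s = (p1 + p2)^2 = 2 p1.p2 = 2 (E1 E2 + |p1||p2|) = 4 E1 E2.
    """
    return math.sqrt(4.0 * e1 * e2)

def resonance_x(mass, sqrt_s):
    """Momentum fraction at which a positron-quark resonance of the given
    mass would sit, assuming the subprocess energy squared is x * s."""
    return mass**2 / sqrt_s**2

sqrt_s = cm_energy_massless(27.5, 820.0)   # beam energies in GeV
print(f"sqrt(s) = {sqrt_s:.1f} GeV")

# Hypothetical 200 GeV state, chosen only to illustrate the mapping M -> x.
print(f"x for a 200 GeV resonance = {resonance_x(200.0, sqrt_s):.3f}")
```

Under these assumptions a narrow resonance would appear as a peak at a fixed value of $x$, on top of the smooth deep inelastic distribution.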


9.6


(a) By the expedient of inserting a $\delta$-function, the differential cross
section for Drell–Yan production of a lepton pair of mass $\sqrt{q^2}$ may
be written as

$$\frac{d\sigma}{dq^2} = \int dx_1\, dx_2\, \frac{d^2\sigma}{dx_1\, dx_2}\, \delta(q^2 - s x_1 x_2).$$

Show that this is equivalent to the form

$$\frac{d\sigma}{dq^2} = \frac{4\pi\alpha^2}{9q^4}\int dx_1\, dx_2\, x_1 x_2\, \delta(x_1 x_2 - \tau) \times \sum_a e_a^2\left[q_a(x_1)\bar{q}_a(x_2) + \bar{q}_a(x_1)q_a(x_2)\right]$$

which, since $q^2 = s\tau$, exhibits a scaling law of the form

$$s^2\, d\sigma/dq^2 = F(\tau).$$

(b) Introduce the Feynman scaling variable

$$x_F = x_1 - x_2$$

with

$$q^2 = s x_1 x_2$$

and show that

$$dq^2\, dx_F = (x_1 + x_2)\, s\, dx_1\, dx_2.$$

Hence show that the Drell–Yan formula can be rewritten as

$$\frac{d^2\sigma}{dq^2\, dx_F} = \frac{4\pi\alpha^2}{9q^4}\,\frac{\tau}{(x_F^2 + 4\tau)^{1/2}}\sum_a e_a^2\left[q_a(x_1)\bar{q}_a(x_2) + \bar{q}_a(x_1)q_a(x_2)\right].$$
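The change of variables in part (b) is easy to sanity-check numerically. The sketch below evaluates the 2 × 2 Jacobian determinant for $(x_1, x_2) \to (q^2, x_F)$ and the identity $(x_1 + x_2)^2 = x_F^2 + 4\tau$ that produces the square root in the final formula. The sample values of $s$, $x_1$ and $x_2$ are arbitrary assumptions.

```python
import math

def jacobian_q2_xf(s, x1, x2):
    """Determinant of d(q2, xF)/d(x1, x2) for q2 = s*x1*x2 and xF = x1 - x2."""
    dq2_dx1, dq2_dx2 = s * x2, s * x1
    dxf_dx1, dxf_dx2 = 1.0, -1.0
    return dq2_dx1 * dxf_dx2 - dq2_dx2 * dxf_dx1

s, x1, x2 = 90200.0, 0.3, 0.1      # arbitrary illustrative values
tau = x1 * x2                      # tau = q2 / s
x_f = x1 - x2

abs_jacobian = abs(jacobian_q2_xf(s, x1, x2))
print(abs_jacobian, s * (x1 + x2))                   # the two should agree
print(x1 + x2, math.sqrt(x_f**2 + 4.0 * tau))        # and so should these
```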


9.7 Verify that if the quarks participating in the Drell–Yan subprocess
$q\bar{q} \to \gamma \to \mu\bar{\mu}$ had spin-0, the CM angular distribution of the final
$\mu^+\mu^-$ pair would be proportional to $(1 - \cos^2\theta)$.


Part IV 


Loops and Renormalization


10

Loops and Renormalization I: The ABC Theory


We have seen how Feynman diagrams represent terms in a perturbation theory
expansion of physical amplitudes, namely the Dyson expansion of section 6.2.
Terms of a given order all involve the same power of a ‘coupling constant’,
which is the multiplicative constant appearing in the interaction Hamiltonian
– for example, ‘g’ in the ABC theory, or the charge ‘e’ in electrodynamics. In
practice, it often turns out that the relevant parameter is actually the square
of the coupling constant, and factors of $4\pi$ have a habit of appearing on a
regular basis; so, for QED, the perturbation series is conveniently ordered
according to powers of the fine structure constant $\alpha = e^2/4\pi \approx 1/137$.


Equivalently, this is an expansion in terms of the number of vertices ap- 
pearing in the diagrams, since one power of the coupling constant is associated 
with each vertex. For a given physical process, the lowest-order diagrams (the 
ones with the fewest vertices) are those in which each vertex is connected 
to every other vertex by just one internal line; these are called tree diagrams. 
The Yukawa (u-channel) exchange process of figure 6.4, and the s-channel pro- 
cess of figure 6.5, are both examples of tree diagrams, and indeed all of our 
calculations so far have not gone further than this lowest-order (‘tree’) level. 
Admittedly, since $\alpha$ is after all pretty small, tree diagrams in QED are likely
to give us a good approximation to compare with experiment. Nevertheless, a 
long history of beautiful and ingenious experiments has resulted in observables 
in QED being determined to an accuracy far better than the O(1%) repre- 
sented by the leading (tree) terms. More generally, precision experiments at 
LEP and other laboratories have an accuracy sensitive to higher-order cor- 
rections in the Standard Model. Hence, some understanding of the physics 
beyond the tree approximation is now essential for phenomenology. 


All higher-order processes beyond the tree approximation involve loops, a 
concept easier to recognize visually than to define in words. In section 6.3.5 
we already met (figure 6.8) one example of an $O(g^4)$ correction to the $O(g^2)$
C-exchange tree diagram of figure 6.4, which contains one loop. The crucial
point is that whereas a tree diagram can be cut into two separate pieces by 
severing just one internal line, to cut a loop diagram into two separate pieces 
requires the severing of at least two internal lines. 
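One standard way to make the loop count precise, which is not used in the text itself, is the graph-theoretic formula $L = I - V + 1$ for a connected diagram with $I$ internal lines and $V$ vertices. The sketch below applies it to the diagrams discussed in this chapter; the vertex and line counts are read off from the descriptions of the figures and should be treated as our assumptions.

```python
def loop_number(n_vertices, n_internal_lines, n_components=1):
    """Number of independent loops of a diagram: L = I - V + (components)."""
    return n_internal_lines - n_vertices + n_components

# (vertices, internal lines) for each diagram, as we read them:
diagrams = {
    "tree C-exchange (figure 6.4)": (2, 1),            # one C line
    "one bubble in the C line (figure 6.8)": (4, 4),   # 2 C lines + A + B
    "two bubbles in the C line": (6, 7),               # 3 C lines + 2 A + 2 B
}

for name, (v, i) in diagrams.items():
    print(f"{name}: L = {loop_number(v, i)}")
```

Each independent loop leaves one 4-momentum undetermined by conservation at the vertices, which is why $L$ also counts the number of momentum integrations in the amplitude.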


In these last two chapters of volume 1, we aim to provide an introduction
to higher-order processes, confining ourselves to ‘one-loop’ order. In the
present chapter we shall concentrate mainly on the particular loop appearing
in figure 6.8. This will lead us into the physics of renormalization for the ABC
theory, which, as a Yukawa-like theory, is a good theoretical laboratory for
studying ‘one-loop physics’, without the complications of spinor and gauge
fields. In the following chapter, we shall discuss one-loop diagrams in QED,
emphasizing some important physical consequences, such as corrections to
Coulomb’s law, anomalous magnetic moments and the running coupling constant.


FIGURE 10.1
$O(g^4)$ contribution to the process A + B → A + B, involving the modification
of the C propagator by the insertion of a loop.


10.1 The propagator correction in ABC theory 
10.1.1 The $O(g^2)$ self-energy $\Pi_C^{[2]}(q^2)$


We consider figure 6.8, reproduced here again as figure 10.1. In section 6.3.5, 
we gave the extra rule (‘(iii)’) needed to write down the invariant amplitude 
for this process. We first show how this rule arises in the special case of 
figure 10.1. 

Clearly, figure 10.1 is a fourth-order process, so it must emerge from the term

$$\begin{aligned} \frac{(-ig)^4}{4!}\int\!\!\int\!\!\int\!\!\int & d^4x_1\, d^4x_2\, d^4x_3\, d^4x_4\, \langle 0|\hat{a}_A(p'_A)\hat{a}_B(p'_B) \\ & \times T\{\hat\phi_A(x_1)\hat\phi_B(x_1)\hat\phi_C(x_1)\ldots\hat\phi_A(x_4)\hat\phi_B(x_4)\hat\phi_C(x_4)\} \\ & \times \hat{a}^\dagger_A(p_A)\hat{a}^\dagger_B(p_B)|0\rangle\,(16E_AE_BE'_AE'_B)^{1/2} \qquad (10.1) \end{aligned}$$

of the Dyson expansion. Since it is basically a u-channel exchange process
($u = (p_A - p'_B)^2 = (p'_A - p_B)^2$), the vev’s involving the external creation and
annihilation operators must appear as they do in equation (6.89) (‘ingoing
A, outgoing B′ at one point $x_2$; ingoing B, outgoing A′ at another point $x_1$’)
rather than as in equation (6.88) (‘ingoing A and B at $x_2$; outgoing A′ and
B′ at $x_1$’). In (10.1), however, we unfortunately have four space-time points
to choose from, rather than merely the two in (6.74). Figuring out exactly
which choices are in fact equivalent and which are not is best left to private
struggle, especially since we are not seriously interested in the numerical value
of our fourth-order corrections in this case. Let us simply consider one choice,
analogous to (6.89). This yields the amplitude (cf (6.91))


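$$\begin{aligned} (-ig)^4\int\!\!\int\!\!\int\!\!\int & d^4x_1\, d^4x_2\, d^4x_3\, d^4x_4\, e^{i(p'_A - p_B)\cdot x_1}\, e^{i(p'_B - p_A)\cdot x_2} \\ & \times \langle 0|T\{\hat\phi_C(x_1)\hat\phi_C(x_2)\hat\phi_A(x_3)\hat\phi_B(x_3)\hat\phi_C(x_3)\hat\phi_A(x_4)\hat\phi_B(x_4)\hat\phi_C(x_4)\}|0\rangle \qquad (10.2) \end{aligned}$$

and we have discarded the numerical factor 1/4!. Once again, there are many
terms in the expansion of the vev of the eight operators in (10.2). But, with
an eye on the structure of the Feynman amplitude at which we are aiming
(figure 10.1), let us consider again just a single contribution

$$\begin{aligned} (-ig)^4\int\!\!\int\!\!\int\!\!\int & d^4x_1\, d^4x_2\, d^4x_3\, d^4x_4\, e^{i(p'_A - p_B)\cdot x_1}\, e^{i(p'_B - p_A)\cdot x_2} \\ & \times \langle 0|T(\hat\phi_C(x_1)\hat\phi_C(x_3))|0\rangle\, \langle 0|T(\hat\phi_C(x_2)\hat\phi_C(x_4))|0\rangle \\ & \times \langle 0|T(\hat\phi_A(x_3)\hat\phi_A(x_4))|0\rangle\, \langle 0|T(\hat\phi_B(x_3)\hat\phi_B(x_4))|0\rangle \qquad (10.3) \end{aligned}$$

which contains four propagators connected as in figure 10.2.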
As we saw in section 6.3.2, each of these propagators is a function only
of the difference of the two space-time points involved. Introducing relative
coordinates $x = x_1 - x_3$, $y = x_2 - x_4$, $z = x_3 - x_4$ and the CM coordinate
$X = \frac{1}{4}(x_1 + x_2 + x_3 + x_4)$, we find (problem 10.1) that (10.3) becomes

$$\begin{aligned} (-ig)^4\int\!\!\int\!\!\int\!\!\int & d^4X\, d^4x\, d^4y\, d^4z\; e^{i(p'_A + p'_B - p_A - p_B)\cdot X}\, e^{i(p'_A - p_B)\cdot(3x - y + 2z)/4} \\ & \times e^{i(p'_B - p_A)\cdot(-x + 3y - 2z)/4}\, D_C(x)\, D_C(y)\, D_A(z)\, D_B(z) \qquad (10.4) \end{aligned}$$

where $D_i$ is the position-space propagator for type-i particles (i = A, B, C),
defined as in (6.98). The integral over $X$ gives the expected overall 4-momentum
conservation factor, $(2\pi)^4\delta^4(p'_A + p'_B - p_A - p_B)$. Setting $q = p_A - p'_B = p'_A - p_B$
(where 4-momentum conservation has been used), (10.4) becomes

$$\begin{aligned} (-ig)^4(2\pi)^4\delta^4(p'_A + p'_B - p_A - p_B)\int\!\!\int\!\!\int & d^4x\, d^4y\, d^4z\; e^{iq\cdot x}\, D_C(x) \\ & \times e^{-iq\cdot y}\, D_C(y)\, e^{iq\cdot z}\, D_A(z)\, D_B(z). \qquad (10.5) \end{aligned}$$


FIGURE 10.2
The space-time structure of the integrand in (10.3).


The integrals over $x$ and $y$ separate out completely, each being just the
Fourier transform of a C propagator, that is, the momentum-space propagator
$\tilde{D}_C(q)$. Since the latter is a function of $q^2$ only, we end up with two
factors of $i/(q^2 - m_C^2 + i\epsilon)$, corresponding to the two C propagators in the
momentum-space Feynman diagram of figure 10.1. Note that the Mandelstam
u-variable is defined by $u = (p_A - p'_B)^2$ and is thus equal to $q^2$; we shall,
however, continue to use $q^2$ rather than $u$ in what follows.
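The change of variables leading to (10.4) and (10.5) can be verified numerically before attempting it by hand. The sketch below draws random space-time points, rebuilds $x_1$ and $x_2$ from $(X, x, y, z)$ using the coefficients appearing in (10.4), and checks that the phase $q\cdot x_1 - q\cdot x_2$ equals $q\cdot(x - y + z)$, as in (10.5). The helper names and the random test points are our own illustrative choices.

```python
import random

def minkowski_dot(a, b):
    """Minkowski product a.b with metric (+, -, -, -)."""
    return a[0] * b[0] - sum(a[i] * b[i] for i in range(1, 4))

def combine(*terms):
    """Linear combination of 4-vectors given as (coefficient, vector) pairs."""
    return tuple(sum(c * v[i] for c, v in terms) for i in range(4))

def random_four_vector():
    return tuple(random.uniform(-1.0, 1.0) for _ in range(4))

random.seed(1)
x1, x2, x3, x4 = (random_four_vector() for _ in range(4))
q = random_four_vector()

# Relative and CM coordinates as defined in the text.
x = combine((1, x1), (-1, x3))
y = combine((1, x2), (-1, x4))
z = combine((1, x3), (-1, x4))
X = combine((0.25, x1), (0.25, x2), (0.25, x3), (0.25, x4))

# Inverse relations implied by the exponents in (10.4).
x1_rebuilt = combine((1, X), (0.75, x), (-0.25, y), (0.5, z))
x2_rebuilt = combine((1, X), (-0.25, x), (0.75, y), (-0.5, z))

max_rebuild_error = max(abs(a - b) for a, b in zip(x1 + x2, x1_rebuilt + x2_rebuilt))

# With p'_A - p_B = q and p'_B - p_A = -q the phase collapses as in (10.5).
phase_before = minkowski_dot(q, x1) - minkowski_dot(q, x2)
phase_after = minkowski_dot(q, combine((1, x), (-1, y), (1, z)))

print(max_rebuild_error, abs(phase_before - phase_after))
```

Both printed numbers should be at the level of floating-point rounding, confirming the coefficients $(3x - y + 2z)/4$ and $(-x + 3y - 2z)/4$.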

The remaining factor represents the loop. Including $(-ig)^2$ for the two
vertices in the loop, it is given by

$$(-ig)^2\int d^4z\; e^{iq\cdot z}\, D_A(z)\, D_B(z) \qquad (10.6)$$

which is the main result of our calculation so far. Since we want to end
up finally with a momentum-space amplitude, let us introduce the A and B
propagators in momentum space, and write (10.6) as (cf (6.99))

$$\begin{aligned} & (-ig)^2\int d^4z\; e^{iq\cdot z}\int\frac{d^4k_1}{(2\pi)^4}\, e^{-ik_1\cdot z}\,\frac{i}{k_1^2 - m_A^2 + i\epsilon}\int\frac{d^4k_2}{(2\pi)^4}\, e^{-ik_2\cdot z}\,\frac{i}{k_2^2 - m_B^2 + i\epsilon} \\ &\quad = (-ig)^2\int\frac{d^4k_1}{(2\pi)^4}\,\frac{d^4k_2}{(2\pi)^4}\,\frac{i}{k_1^2 - m_A^2 + i\epsilon}\,\frac{i}{k_2^2 - m_B^2 + i\epsilon}\,(2\pi)^4\delta^4(k_1 + k_2 - q) \\ &\quad = (-ig)^2\int\frac{d^4k}{(2\pi)^4}\,\frac{i}{k^2 - m_A^2 + i\epsilon}\,\frac{i}{(q-k)^2 - m_B^2 + i\epsilon} \qquad (10.7) \\ &\quad \equiv -i\Pi_C^{[2]}(q^2), \qquad (10.8) \end{aligned}$$

where we have defined the function $-i\Pi_C^{[2]}(q^2)$ as the loop (or ‘bubble’)
amplitude appearing in figure 10.1. It is a function of $q^2$, as follows from Lorentz
invariance. The ‘[2]’ refers to the two powers of $g$, as will be explained shortly,
after (10.15).

Careful consideration of the equivalences among the various contractions
shows that the amplitude corresponding to figure 10.1 is, in fact, just the
simple expression

$$(-ig)^2(2\pi)^4\delta^4(p'_A + p'_B - p_A - p_B)\,\frac{i}{q^2 - m_C^2 + i\epsilon}\left(-i\Pi_C^{[2]}(q^2)\right)\frac{i}{q^2 - m_C^2 + i\epsilon} \qquad (10.9)$$

where $\Pi_C^{[2]}(q^2)$ is given in (10.8). We see that whereas the ‘single-particle’
pieces, involving one C-exchange, do not involve any integral in momentum
space, the loop (which involves both A and B particles) does involve a momentum
integral. This can be simply understood in terms of 4-momentum conservation,
which holds at every vertex of a Feynman graph. At the top (or bottom)
vertex of figure 10.1, the 4-momentum $q$ of the C-particle is fully determined
by that of the incoming and outgoing particles ($q = p_A - p'_B = p'_A - p_B$).
This same 4-momentum $q$ flows in (and out) of the loop in figure 10.1, but
nothing determines how it is to be shared between the A- and B-particles;
all that can be said is that if the 4-momentum of A is $k$ (as in (10.7)) then
that of B is $q - k$, so that their sum is $q$. The ‘free’ variable $k$ then has to be
integrated over, and this is the physical origin of rule (iii) of section 6.3.5.

We have devoted some time to the steps leading to expression (10.7), not 
only in order to follow the emergence of rule (iii) mathematically, but so as to 
lend some plausibility to a very important statement: the Feynman rules for 
associating factors with vertices and propagators, which we learned for tree 
graphs in chapters 6 and 8, also work, with the addition of rule (iii), for all 
more complicated graphs as well! Having seen most of just one fairly short 
calculation of a higher-order amplitude, the reader may perhaps now begin to 
appreciate just how powerful is the precise correspondence between ‘diagrams
and amplitudes’, given by the Feynman rules.

Having arrived at the expression for our first one-loop graph, we must
at once draw the reader’s attention to the bad news: the integral in (10.7) is
divergent at large values of $k$. We shall postpone a more detailed mathematical
analysis until section 10.3.1, but the divergence can be plausibly inferred just
from a simple counting of powers: there are four powers of $k$ in the numerator
and four in the denominator, and the likelihood is that the integral diverges
as $\int^\Lambda k^3\, dk/k^4 \sim \ln\Lambda$, as $\Lambda \to \infty$. This is plainly a disaster: a quantity
which was supposed to be a small correction in perturbation theory is actually
infinite! Such divergences, occurring as loop momenta go to infinity, are called
‘ultraviolet divergences’, and they are ubiquitous in quantum field theory.
Only after a long struggle with these infinities was it understood how to obtain
physically sensible results from such perturbation expansions. Depending on
the type of field theory involved, the infinities can often be ‘tamed’ through a
procedure known as renormalization, to which we shall provide an introduction
in this and the following chapter.
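The power counting can be made concrete with a toy radial integral that has the same large-$k$ behaviour as the estimate above, namely $\int_0^\Lambda k^3\, dk/(k^2 + m^2)^2$. This integrand is our own illustrative stand-in, not the actual integral (10.7); its closed form follows from the substitution $u = k^2 + m^2$. The sketch shows that each additional decade in $\Lambda$ adds approximately $\ln 10$, the signature of a logarithmic divergence.

```python
import math

def toy_loop_integral(cutoff, mass=1.0):
    """Closed form of the toy integral  int_0^cutoff k^3 dk / (k^2 + mass^2)^2.

    With u = k^2 + mass^2 this is (1/2) [ln u + mass^2/u] evaluated
    between u = mass^2 and u = cutoff^2 + mass^2.
    """
    u = cutoff**2 + mass**2
    return 0.5 * (math.log(u / mass**2) + mass**2 / u - 1.0)

previous = None
for cutoff in (1e1, 1e2, 1e3, 1e4):
    value = toy_loop_integral(cutoff)
    step = "" if previous is None else f"  (increase {value - previous:.4f})"
    print(f"cutoff = {cutoff:8.0f}   integral = {value:.4f}{step}")
    previous = value

print(f"ln 10 = {math.log(10.0):.4f}")
```

The growth is slow but unbounded: no finite value is approached as the cut-off is removed.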

The physical ideas behind renormalization are, however, just as relevant 
in cases — such as condensed matter physics — where the analogous higher- 
order (loop) corrections are not infinite, though possibly large. In quantum 
mechanics, infinite momentum corresponds to zero distance, and our fields 
are certainly ‘point-like’. But in condensed matter physics there is generally a 
natural non-zero smallest distance — the lattice size, or an atomic diameter, for 
example. In quantum field theory, such a ‘shortest distance’ would correspond 
to a ‘highest momentum’, meaning that the magnitudes of loop momenta 
would run from zero up to some finite limit $\Lambda$, say, rather than infinity. Such
a $\Lambda$ is called a (momentum) ‘cut-off’. With such a cut-off in place, our loop
integrals are of course finite — but it would seem that we have then maltreated 
our field theory in some way. However, we might well ask whether we seriously 
believe that any of our quantum field theories is literally valid for arbitrarily 
high energies (or arbitrarily small distances). The answer is surely no: we are 
virtually certain that ‘new physics’ will come into play at some stage, which is 
not contained in — say — the QED, or even the Standard Model, Lagrangian. 
At what scale this new physics will enter (the Planck energy? 1 TeV?) we 
do not know, but surely the current models will break down at some point. 
We should not be too alarmed, therefore, by formal divergences as $\Lambda \to \infty$.
Rather, it may be sensible to regard a cut-off $\Lambda$ as standing for some ‘new
physics’ scale, accepting some such manoeuvre as physically realistic as well 
as mathematically prudent. 

At the same time, however, we would not want our physical predictions,
made using quantum field theories, to depend sensitively on $\Lambda$, i.e. on the
unknown short-distance physics, in this interpretation. Indeed, theories exist
(for example, those in the Standard Model and the ABC theory) which can be
reformulated in such a way that all dependence on $\Lambda$ disappears, as $\Lambda \to \infty$;
these are, precisely, renormalizable quantum field theories. Roughly speaking,
a renormalizable quantum field theory is one such that, when formulae are
expressed in terms of certain ‘physical’ parameters taken from experiment,
rather than in terms of the original parameters appearing in the Lagrangian,
calculated quantities will be finite and independent of $\Lambda$ as $\Lambda \to \infty$.

Solid state physics provides a close analogy. There, the usefulness of a 
description of, say, electrons in a metal in terms of their ‘effective charge’ and 
‘effective mass’, rather than their free-space values, is well established. In this 
analogy, the free-space quantities correspond to our Lagrangian values, while 
the effective parameters correspond to our ‘physical’ ones. In both cases, the 
interactions are causing changes to the parameters. 

It is clear that we need to understand more precisely just what our ‘physical
parameters’ might be and how they might be defined. This is what we aim
to do in the remainder of the present section, and in the next one, before
returning in section 10.3 to the mathematical details associated with evaluating
(10.7), and indicating how renormalization works for the self-energy. Having
thus prepared the ground, we shall introduce a more powerful approach in
section 10.4, and offer a few preliminary remarks about ‘renormalizability’
in section 10.5, returning to that topic at the end of the following chapter.
Although usually not explicitly indicated, loop corrections considered in this
and the following section will be understood to be defined with a cut-off $\Lambda$,
so that they are finite.


FIGURE 10.3
$O(g^6)$ term in A + B → A + B, involving the insertion of two loops in the C
propagator.


To begin the discussion of the physical significance of our $O(g^4)$ correction,
(10.9), it is convenient to consider both the $O(g^2)$ term (6.100) and the $O(g^4)$
correction together, obtaining

$$(-ig)^2(2\pi)^4\delta^4(p'_A + p'_B - p_A - p_B)\left\{\frac{i}{q^2 - m_C^2} + \frac{i}{q^2 - m_C^2}\left(-i\Pi_C^{[2]}(q^2)\right)\frac{i}{q^2 - m_C^2}\right\} \qquad (10.10)$$

where the $i\epsilon$ in the C propagators does not need to be retained. Both the
form of (10.10), and inspection of figure 10.1, suggest that the $O(g^4)$ term
we have calculated can be regarded as an $O(g^2)$ correction to the propagator
for the C-particle. Indeed, we can easily imagine adding in the $O(g^6)$ term
shown in figure 10.3, and in fact the whole infinite series of such ‘bubbles’
connected by simple C propagators.


FIGURE 10.4
Series of one-loop (or ‘bubble’) insertions in the C propagator.


The infinite geometric series for the
corrected propagator shown in figure 10.4 has the form


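$$\begin{aligned} & \frac{i}{q^2 - m_C^2} + \frac{i}{q^2 - m_C^2}\left(-i\Pi_C^{[2]}(q^2)\right)\frac{i}{q^2 - m_C^2} \\ &\qquad + \frac{i}{q^2 - m_C^2}\left(-i\Pi_C^{[2]}(q^2)\right)\frac{i}{q^2 - m_C^2}\left(-i\Pi_C^{[2]}(q^2)\right)\frac{i}{q^2 - m_C^2} + \cdots \qquad (10.11) \\ &\quad = \frac{i}{q^2 - m_C^2}\,(1 + r + r^2 + \cdots) \qquad (10.12) \end{aligned}$$

where

$$r = \Pi_C^{[2]}(q^2)/(q^2 - m_C^2). \qquad (10.13)$$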


The geometric series in (10.12) may be summed, at least formally¹, to give
$(1 - r)^{-1}$ so that (10.12) becomes

$$\frac{i}{q^2 - m_C^2}\,\frac{1}{1 - \Pi_C^{[2]}(q^2)/(q^2 - m_C^2)} = \frac{i}{q^2 - m_C^2 - \Pi_C^{[2]}(q^2)}. \qquad (10.14)$$

In this form it is particularly clear that we are dealing with corrections to the
simple C propagator $i/(q^2 - m_C^2)$. $\Pi_C^{[2]}$ is called the $O(g^2)$ self-energy.
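The resummation in (10.11) to (10.14) is easy to test numerically. In the sketch below a finite real number stands in for a regularized value of $\Pi_C^{[2]}(q^2)$; that number, and the values of $q^2$ and $m_C^2$, are arbitrary assumptions chosen so that $|r| < 1$.

```python
def summed_propagator(q2, m2, self_energy):
    """Right-hand side of (10.14): i / (q^2 - m^2 - Pi)."""
    return 1j / (q2 - m2 - self_energy)

def partial_sum(q2, m2, self_energy, n_terms):
    """First n_terms of the series (10.12): [i/(q^2 - m^2)] (1 + r + r^2 + ...)."""
    free = 1j / (q2 - m2)
    r = self_energy / (q2 - m2)        # the ratio r of (10.13)
    return free * sum(r**n for n in range(n_terms))

q2, m2, toy_pi = 2.0, 1.0, 0.3         # toy values giving r = 0.3
exact = summed_propagator(q2, m2, toy_pi)

for n_terms in (1, 2, 5, 10, 40):
    approx = partial_sum(q2, m2, toy_pi, n_terms)
    print(f"{n_terms:2d} terms: |difference from (10.14)| = {abs(approx - exact):.2e}")
```

The first term alone is the free propagator; keeping two terms reproduces the bracket of (10.10); the full sum moves the pole from $q^2 = m_C^2$ to the zero of $q^2 - m_C^2 - \Pi_C^{[2]}$.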
Before proceeding with the analysis of (10.14), we note that it is a special
case of the more general expression

$$\tilde{D}_C(q^2) = \frac{i}{q^2 - m_C^2 - \Pi_C(q^2)} \qquad (10.15)$$

where $\tilde{D}_C(q^2)$ is the complete (including all corrections) C propagator, and
$\Pi_C(q^2)$ is the sum of all ‘insertions’ in the C line, excluding those which
can be cut into two separate bits by severing a single line: $\Pi_C(q^2)$ is the
one-particle irreducible self-energy and we must exclude all one-particle
reducible bits from it as they are already included in the geometric series
summation (cf (10.11)). The amplitude $\Pi_C^{[2]}$ which we have calculated is simply
the lowest-order ($O(g^2)$) contribution to $\Pi_C(q^2)$; an $O(g^4)$ contribution to
$\Pi_C(q^2)$ is shown in figure 10.5.


¹ Properly speaking this is valid only for $|r| < 1$, yet we know that $\Pi_C^{[2]}(q^2)$ actually
diverges! As we shall see, however, renormalization will be carried out after making such
quantities finite by ‘regularization’ (section 10.3.2), and then working systematically at a
given order in $g$ (section 10.4).


FIGURE 10.5
$O(g^4)$ contribution to $\Pi_C(q^2)$.


10.1.2 Mass shift 


We return to the expression (10.14) which includes the effect of all the iterated
$O(g^2)$ bubbles in the C propagator, where $\Pi_C^{[2]}(q^2)$ is given by

$$-i\Pi_C^{[2]}(q^2) = (-ig)^2\int\frac{d^4k}{(2\pi)^4}\,\frac{i}{k^2 - m_A^2 + i\epsilon}\,\frac{i}{(q-k)^2 - m_B^2 + i\epsilon}. \qquad (10.16)$$

Postponing the evaluation of (10.16) (and in particular the treatment of its
divergence) until section 10.3, we proceed to discuss the further implications
of (10.14).

First, suppose $\Pi_C^{[2]}$ were simply a constant, $\delta m_C^2$, say. In the absence of this
correction, we know (cf section 6.3.3) that the vanishing of the denominator
of the C propagator would correspond to the ‘mass-shell condition’ $q^2 = m_C^2$
appropriate to a free particle of momentum $\boldsymbol{q}$ and energy $q^0 = (\boldsymbol{q}^2 + m_C^2)^{1/2}$,
where $m_C$ is the mass of a C particle. It seems very plausible, therefore,
to interpret the constant $\delta m_C^2$ as a shift in the (mass)² of the C particle,
the denominator of (10.14) now vanishing at $q^0 = (\boldsymbol{q}^2 + m_C^2 + \delta m_C^2)^{1/2}$, if
$\Pi_C^{[2]} \approx \delta m_C^2$. The idea that the mass of a particle can be changed from its ‘free
space’ value by the presence of interactions with its ‘environment’ is a familiar
one in condensed matter physics, as noted above. In the case of electrons in
a metal, for example, it is not surprising that the presence of the lattice ions,
and the attendant band structure, affect the response of conduction electrons
to external fields, so that their apparent inertia changes. In the present case,
the ‘environment’ is, in fact, the vacuum. The process described by the bubble
$\Pi_C^{[2]}(q^2)$ is one in which a C particle dissociates virtually into an A–B pair,
which then recombine into the C particle, no other ‘external’ source being
present. As in earlier uses of the word, by ‘virtual’ here is meant a process in
which the participating particles leave their mass-shells. Thus, in particular,
in the expression (10.16) for $\Pi_C^{[2]}$, it will in general be the case that $k^2 \neq m_A^2$,
and $(q - k)^2 \neq m_B^2$.

In the case of the electron in a metal, both the ‘free’ and the ‘effective’
masses are measurable quantities. But we cannot get outside the vacuum!
This strongly suggests that what we must mean by ‘the physical (mass)²’ of
a particle in our ABC theory is not the ‘free’ (Lagrangian) value $m_i^2$, which
is unmeasurable, but the effective (mass)² which includes all vacuum
interactions. This ‘physical (mass)²’ may be defined to be that value of $q^2$ for
which

$$q^2 - m_i^2 - \Pi_i(q^2) = 0 \qquad (10.17)$$

where $\Pi_i(q^2)$ is the complete one-particle irreducible self-energy for particle
type ‘i’. If we call the physical mass $m_{{\rm ph},i}$, then, we will have
$q^2 - m_i^2 - \Pi_i(q^2) = 0$ when $q^2 = m^2_{{\rm ph},i}$.

What we are dealing with in (10.14) is just the lowest-order contribution
to $\Pi_C(q^2)$, namely $\Pi_C^{[2]}(q^2)$, so that in our case $m^2_{{\rm ph},C}$ is determined by the
condition

$$q^2 - m_C^2 - \Pi_C^{[2]}(q^2) = 0 \quad\text{when}\quad q^2 = m^2_{{\rm ph},C} \qquad (10.18)$$

which (to this order) is

$$m^2_{{\rm ph},C} = m_C^2 + \Pi_C^{[2]}(m^2_{{\rm ph},C}). \qquad (10.19)$$

Once we have calculated $\Pi_C^{[2]}$ (see section 10.3), equation (10.19) could be
regarded as an equation to determine $m^2_{{\rm ph},C}$ in terms of the parameter $m_C^2$,
which appeared in the original ABC Lagrangian. This might, indeed, be the
way such an equation would be viewed in condensed matter physics, where we
should know the values of the parameters in the Lagrangian. But in the
field-theory case $m_C^2$ is unobservable, so that such an equation has no predictive
value. Instead, we may regard it as an equation determining (up to $O(g^2)$)
$m_C^2$ in terms of $m^2_{{\rm ph},C}$, thus enabling us to eliminate, to this order in $g$, all
occurrences of the unobservable parameter $m_C$ from our amplitudes in favour
of the physical parameter $m_{{\rm ph},C}$. Note that $\Pi_C^{[2]}$ contains two powers of $g$, so
that in the spirit of systematic perturbation theory, the mass shift represented
by (10.19) is a second-order correction.

The crucial point here is that $\Pi_C^{[2]}$ depends on the cut-off $\Lambda$, whereas the
physical mass $m_{{\rm ph},C}$ clearly does not. But there is nothing to stop us supposing
that the unknown and unobservable Lagrangian parameter $m_C$ depends
on $\Lambda$ in just such a way as to cancel the $\Lambda$-dependence of $\Pi_C^{[2]}$, leaving $m_{{\rm ph},C}$
independent of $\Lambda$. This is the beginning of the ‘renormalization procedure’ in
quantum field theory.
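The logic of reading (10.19) ‘backwards’ can be illustrated with a deliberately crude toy model. Below, the self-energy at the physical mass is replaced by a hypothetical function growing like $\ln\Lambda^2$; the coefficient `TOY_COEFF` and the functional form are assumptions made for illustration, not the result of evaluating (10.16). The physical (mass)² is held fixed and the Lagrangian parameter is allowed to depend on the cut-off.

```python
import math

M_PH_SQ = 1.0      # physical (mass)^2, held fixed (arbitrary units)
TOY_COEFF = 0.05   # hypothetical stand-in for the O(g^2) prefactor

def toy_self_energy(cutoff):
    """Toy cut-off dependence ~ ln(cutoff^2); NOT the actual integral (10.16)."""
    return TOY_COEFF * math.log(cutoff**2 / M_PH_SQ)

def lagrangian_mass_sq(cutoff):
    """Equation (10.19) read backwards: m_C^2 = m_ph^2 - Pi(m_ph^2)."""
    return M_PH_SQ - toy_self_energy(cutoff)

def physical_mass_sq(cutoff):
    """Equation (10.19) read forwards: m_ph^2 = m_C^2 + Pi(m_ph^2)."""
    return lagrangian_mass_sq(cutoff) + toy_self_energy(cutoff)

for cutoff in (1e1, 1e2, 1e3):
    print(f"cutoff = {cutoff:6.0f}   m_C^2 = {lagrangian_mass_sq(cutoff):+.4f}"
          f"   m_ph^2 = {physical_mass_sq(cutoff):.4f}")
```

In this sketch the Lagrangian parameter drifts with the cut-off while the quantity expressed in terms of the measured mass stays put, which is the pattern the renormalization procedure exploits.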


10.1.3 Field strength renormalization 


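We now need to consider the more realistic case in which $\Pi_C^{[2]}(q^2)$ is not a
constant. Let us expand it about the point $q^2 = m^2_{{\rm ph},C}$, writing

$$\Pi_C^{[2]}(q^2) = \Pi_C^{[2]}(m^2_{{\rm ph},C}) + (q^2 - m^2_{{\rm ph},C})\left.\frac{d\Pi_C^{[2]}}{dq^2}\right|_{q^2 = m^2_{{\rm ph},C}} + \cdots. \qquad (10.20)$$

The corrected propagator (10.14) then becomes

$$\begin{aligned} & \frac{i}{q^2 - m_C^2 - \Pi_C^{[2]}(m^2_{{\rm ph},C}) - (q^2 - m^2_{{\rm ph},C})\left.\dfrac{d\Pi_C^{[2]}}{dq^2}\right|_{q^2 = m^2_{{\rm ph},C}} + \cdots} \qquad (10.21) \\ &\quad = \frac{i}{(q^2 - m^2_{{\rm ph},C})\left[1 - \left.\dfrac{d\Pi_C^{[2]}}{dq^2}\right|_{q^2 = m^2_{{\rm ph},C}}\right] + O(q^2 - m^2_{{\rm ph},C})^2} \qquad (10.22) \end{aligned}$$

where (10.19) has been used to write $q^2 - m_C^2 - \Pi_C^{[2]}(m^2_{{\rm ph},C}) = q^2 - m^2_{{\rm ph},C}$.
The expression (10.22) has indeed the expected form for a ‘physical C’
propagator, having the simple behaviour $\sim 1/(q^2 - m^2_{{\rm ph},C})$ for $q^2 \approx m^2_{{\rm ph},C}$. However,
the normalization of this (corrected) propagator is different from that of the
‘free’ one, $i/(q^2 - m_C^2)$, because of the extra factor

$$\left[1 - \left.\frac{d\Pi_C^{[2]}}{dq^2}\right|_{q^2 = m^2_{{\rm ph},C}}\right]^{-1}.$$

To the order at which we are working ($O(g^2)$), it is consistent to replace this
expression by

$$1 + \left.\frac{d\Pi_C^{[2]}}{dq^2}\right|_{q^2 = m^2_{{\rm ph},C}}.$$

Let us see how this factor may be understood.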

Our $O(g^2)$ corrected propagator is an approximation to the exact propagator
which we may write as $\langle\Omega|T(\hat\phi_C(x_1)\hat\phi_C(x_2))|\Omega\rangle$, in coordinate space, where
$|\Omega\rangle$ is the exact vacuum. The free propagator, however, is $\langle 0|T(\hat\phi_C(x_1)\hat\phi_C(x_2))|0\rangle$
as calculated in section 6.3.2. Consider one term in the latter,
$\theta(t_1 - t_2)\langle 0|\hat\phi_C(x_1)\hat\phi_C(x_2)|0\rangle$, and insert a complete set of free-particle states
‘$1 = \sum_n |n\rangle\langle n|$’ between the two free fields, obtaining

$$\theta(t_1 - t_2)\sum_n \langle 0|\hat\phi_C(x_1)|n\rangle\langle n|\hat\phi_C(x_2)|0\rangle. \qquad (10.23)$$

The only free particle state $|n\rangle$ having a non-zero matrix element of the free
field $\hat\phi_C$ to the vacuum is the 1-C state, for which $\langle 0|\hat\phi_C(x)|C, k\rangle = e^{-ik\cdot x}$ as
we learned in chapters 5 and 6. Thus (10.23) becomes (cf equation (6.92))

$$\theta(t_1 - t_2)\int\frac{d^3\boldsymbol{k}}{(2\pi)^3\, 2\omega_k}\; e^{-i\omega_k(t_1 - t_2) + i\boldsymbol{k}\cdot(\boldsymbol{x}_1 - \boldsymbol{x}_2)} \qquad (10.24)$$

which is exactly the first term of equation (6.92). Consider now carrying out a
similar manipulation for the corresponding term of the interacting propagator,
obtaining

$$\theta(t_1 - t_2)\sum_n \langle\Omega|\hat\phi_C(x_1)|n\rangle\langle n|\hat\phi_C(x_2)|\Omega\rangle \qquad (10.25)$$

where the states $|n\rangle$ are now the exact eigenstates of the full Hamiltonian. The
crucial difference between (10.23) and (10.25) is that in (10.25), multi-particle
states can appear in the states $|n\rangle$. For example, the state $|A, B\rangle$ consisting
of an A particle and a B particle will enter, because the interaction couples
this state to the 1-C states created and destroyed in $\hat\phi_C$: indeed, just such an
A + B state is present in $\Pi_C^{[2]}$. This means that, whereas in the free case the
‘content’ of the state $\langle 0|\hat\phi_C(x)$ was fully exhausted by the 1-C state $|C, k\rangle$
(in the sense that all overlaps with other states $|n\rangle$ were zero), this is not so
in the interacting case. The ‘content’ of $\langle\Omega|\hat\phi_C(x)$ is not fully exhausted by
the state $|C, k\rangle$: rather, it has overlaps with many other states. Now the sum
total of all these overlaps (in the sense of $\sum_n |n\rangle\langle n|$) must be unity. Thus
it seems clear that the ‘strength’ of the single matrix element $\langle\Omega|\hat\phi_C(x)|C, k\rangle$
in the interacting case cannot be the same as the free case (where the single
state exhausted the completeness sum). However, we expect it to be true that
$\langle\Omega|\hat\phi_C(x)|C, k\rangle$ is still basically the wavefunction for the C-particle. Hence we
shall write

$$\langle\Omega|\hat\phi_C(x)|C, k\rangle = \sqrt{Z_C}\; e^{-ik\cdot x} \qquad (10.26)$$

where $Z_C$ is a constant to take account of the change in normalization
(the renormalization, in fact) required by the altered ‘strength’ of the matrix
element.

If (10.26) is accepted, we can now imagine repeating the steps leading from
equation (6.92) to equation (6.98) but this time for $\langle\Omega|T(\hat\phi_C(x_1)\hat\phi_C(x_2))|\Omega\rangle$,
retaining explicitly only the single-particle state $|C, k\rangle$ in (10.25), and using
the physical (mass)², $m^2_{{\rm ph},C}$. We should then arrive at a propagator in the
interacting case which has the form

$$\langle\Omega|T(\hat\phi_C(x_1)\hat\phi_C(x_2))|\Omega\rangle = \int\frac{d^4k}{(2\pi)^4}\; e^{-ik\cdot(x_1 - x_2)}\left\{\frac{iZ_C}{k^2 - m^2_{{\rm ph},C} + i\epsilon} + \text{multiparticle contributions}\right\}. \qquad (10.27)$$

The single-particle contribution in (10.27), after undoing the Fourier transform,
has exactly the same form as the one we found in (10.22), if we identify
the field strength renormalization constant $Z_C$ with the proportionality factor
in (10.22), to this order:

$$Z_C \approx Z_C^{[2]} = 1 + \left.\frac{d\Pi_C^{[2]}}{dq^2}\right|_{q^2 = m^2_{{\rm ph},C}}. \qquad (10.28)$$

This is how the change in normalization in (10.22) is to be interpreted.
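A toy self-energy that is exactly linear in $q^2$ makes the content of (10.19), (10.22) and (10.28) explicit, because the expansion (10.20) then terminates. In the sketch below $\Pi(q^2) = a + b\,q^2$ with small hypothetical constants $a$ and $b$; these values, and the function names, are our own illustrative assumptions.

```python
def toy_pi(q2, a, b):
    """Toy self-energy, linear in q^2 (an assumption for illustration)."""
    return a + b * q2

def physical_mass_sq(m2, a, b):
    """Solve m_ph^2 = m^2 + Pi(m_ph^2), cf. (10.19), for the linear toy Pi."""
    return (m2 + a) / (1.0 - b)

def pole_residue(m2, a, b, offset=1e-3):
    """(q^2 - m_ph^2) / (q^2 - m^2 - Pi(q^2)) evaluated just off the pole,
    i.e. the factor multiplying i/(q^2 - m_ph^2) in (10.22)."""
    m_ph_sq = physical_mass_sq(m2, a, b)
    q2 = m_ph_sq + offset
    return (q2 - m_ph_sq) / (q2 - m2 - toy_pi(q2, a, b))

m2, a, b = 1.0, 0.02, 0.01
m_ph_sq = physical_mass_sq(m2, a, b)
z_exact = pole_residue(m2, a, b)       # equals 1/(1 - dPi/dq^2) for this toy
z_second_order = 1.0 + b               # the replacement made in (10.28)

print(f"m_ph^2 = {m_ph_sq:.6f}")
print(f"Z exact = {z_exact:.6f},  1 + dPi/dq^2 = {z_second_order:.6f}")
```

The two values of $Z$ differ only at order $b^2$, which in the field theory would correspond to $O(g^4)$ and is therefore beyond the order retained in (10.28).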

It may be helpful to sketch briefly an analogy between this ‘renormalization’
and a very similar one in ordinary quantum mechanical perturbation
theory. Suppose we have a Hamiltonian $H = H_0 + V$ and that the $|n\rangle$ are
a complete set of orthonormal states such that $H_0|n\rangle = E_n^{(0)}|n\rangle$. The exact
eigenstates $|\bar{n}\rangle$ satisfy

$$(H_0 + V)|\bar{n}\rangle = E_n|\bar{n}\rangle. \qquad (10.29)$$

To obtain $|\bar{n}\rangle$ and $E_n$ in perturbation theory, we write

$$|\bar{n}\rangle = \sqrt{N_n}\,|n\rangle + \sum_{i \neq n} c_{in}|i\rangle \qquad (10.30)$$

where, if $|\bar{n}\rangle$ is also normalized, we have

$$1 = N_n + \sum_{i \neq n} |c_{in}|^2. \qquad (10.31)$$

$N_n$ cannot be unity, since non-zero amounts of the states $|i\rangle$ ($i \neq n$) have been
‘mixed in’ by the perturbation, just as the A + B state was introduced into
the summation $\sum_n |n\rangle\langle n|$, in addition to the 1-C state. Inserting (10.30)
into (10.29) and taking the bracket with $\langle j|$ yields

$$c_{jn} = -\frac{\langle j|V|\bar{n}\rangle}{E_j^{(0)} - E_n} \qquad (10.32)$$

which is still an exact expression. The lowest non-trivial approximation to
$c_{jn}$ is to take $|\bar{n}\rangle \approx \sqrt{N_n}\,|n\rangle$ and $E_n \approx E_n^{(0)}$ in (10.32), giving

$$c_{jn} \approx -\sqrt{N_n}\,\frac{\langle j|V|n\rangle}{E_j^{(0)} - E_n^{(0)}} \equiv -\sqrt{N_n}\,\frac{V_{jn}}{E_j^{(0)} - E_n^{(0)}}. \qquad (10.33)$$

Equation (10.31) then gives $N_n$ as

$$N_n \approx \Big(1 + \sum_{j \neq n} |V_{jn}|^2/(E_j^{(0)} - E_n^{(0)})^2\Big)^{-1} \approx 1 - \sum_{j \neq n} |V_{jn}|^2/(E_j^{(0)} - E_n^{(0)})^2 \qquad (10.34)$$

to second order in $V_{jn}$. The reader may ponder on the analogy between (10.34)
and (10.28).
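The estimate (10.34) can be compared with an exactly solvable case. The sketch below takes a two-level system, with unperturbed energies $0$ and $\Delta$ and a purely off-diagonal perturbation $V_{jn} = v$; these choices are our own assumptions, made so that the exact eigenvector can be written in closed form.

```python
import math

def exact_norm_factor(delta, v):
    """N_n = |<n|n_bar>|^2 for H = [[0, v], [v, delta]], lower eigenstate.

    The eigenvector is proportional to (1, c) with c = E/v, where E is the
    lower eigenvalue, so that N_n = 1 / (1 + c^2).
    """
    e_lower = 0.5 * (delta - math.sqrt(delta**2 + 4.0 * v**2))
    c = e_lower / v
    return 1.0 / (1.0 + c**2)

def second_order_norm_factor(delta, v):
    """The second-order estimate (10.34) with a single state j != n."""
    return 1.0 - (v / delta) ** 2

delta = 1.0
for v in (0.01, 0.05, 0.2):
    exact = exact_norm_factor(delta, v)
    approx = second_order_norm_factor(delta, v)
    print(f"v = {v:4.2f}   exact N = {exact:.6f}   (10.34) = {approx:.6f}")
```

For weak coupling the two agree closely, and $N_n < 1$ in both: part of the normalization has been transferred to the admixed state, just as part of the strength of $\langle\Omega|\hat\phi_C(x)$ is carried by multiparticle states.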


10.2 The vertex correction 


At the same order ($g^4$) of perturbation theory, we should also include, for
consistency, the processes shown in figures 10.6(a) and (b). Figure 10.6(a),
for example, has the general form

$$-ig\,\frac{i}{q^2 - m_C^2}\left(-igG^{[2]}(p_A, p'_B)\right) \qquad (10.35)$$

where $-igG^{[2]}$ is the ‘triangle’ loop, given by an expression similar to (10.16)
but with a factor $(-ig)^3$ and three propagators. The ‘vertex correction’ $G^{[2]}$
depends on just two of its external 4-momenta because the third is determined
by 4-momentum conservation, as usual. Thus, the addition of figure 10.6(a)
and the $O(g^2)$ C-exchange tree diagram gives

$$-ig\,\frac{i}{q^2 - m_C^2}\left\{-ig + \left(-igG^{[2]}(p_A, p'_B)\right)\right\} \qquad (10.36)$$

from which it seems plausible that $G^{[2]}$ will contribute, among other effects,
to a change in $g$. This change will be of order $g^2$, since we may write the
{...} bracket in (10.36) as

$$-ig\{1 + G^{[2]}(p_A, p'_B)\} \qquad (10.37)$$

where $G^{[2]}$ is dimensionless and contains a $g^2$ factor, hence the superscript
[2].


FIGURE 10.6
$O(g^4)$ contributions to A + B → A + B, involving corrections to the ABC
vertices in figure 6.4.

Once again, the effect of interactions with the environment (i.e. vacuum 
fluctuations) has been to alter the value of a Lagrangian parameter away from 
the ‘free’ value. In the case of g the change is analogous to that in which an 
electron in a metal acquires an ‘effective charge’. How we define the ‘physical 
g’ is less clear than in the case of the physical mass and we shall not pursue 
this point here, since we shall discuss it again in the more interesting case of 
the charge ‘e’ in QED, in the following chapter. At all events, some suitable 
definition of ‘$g_{\rm ph}$’ can be given, so that it can be related to $g$ after the relevant
amplitudes have been computed. 

Let us briefly recapitulate progress. We are studying higher-order (one-loop)
corrections to tree graph amplitudes in the ABC model, which has the
Lagrangian density:

$$\hat{\mathcal{L}} = \sum_{i = A, B, C}\left\{\tfrac{1}{2}\partial_\mu\hat\phi_i\,\partial^\mu\hat\phi_i - \tfrac{1}{2}m_i^2\hat\phi_i^2\right\} - g\,\hat\phi_A\hat\phi_B\hat\phi_C. \qquad (10.38)$$


FIGURE 10.7
Elementary one-loop amplitudes: (a) self-energy; (b) vertex correction.


We have found that the loops considered so far, namely those in figures 10.1 
and 10.5, have the following qualitative effects: 


(i) the position of the single-particle mass-shell condition becomes shifted
away from the ‘Lagrangian’ value $m_i^2$ to a ‘physical’ value $m^2_{{\rm ph},i}$
given by the vanishing of an expression such as (10.17);


(ii) the vacuum-to-one-particle matrix elements of the fields $\hat\phi_i$ have to
be ‘renormalized’ by a factor $\sqrt{Z_i}$, given by (10.28) to $O(g^2)$ for
$i = C$, and these factors have to be included in S-matrix elements;


(iii) the propagators contain some contribution from two-particle states
(e.g. ‘A + B’ for the C propagator);


(iv) the Lagrangian coupling $g$ is shifted by the interactions to a ‘physical’
value $g_{\rm ph}$.


Responsible for these effects were two ‘elementary’ loops, that for $-i\Pi_C^{[2]}$ shown
in figure 10.7(a) and that for $-igG^{[2]}$ shown in figure 10.7(b). It is noteworthy
that the effects (i), (ii) and (iv) all relate to changes (renormalizations, shifts) 
in the fields and parameters of the original Lagrangian. We say, collectively, 
that the ‘fields, masses and coupling have been renormalized’ — i.e. generi- 
cally altered from their ‘free’ values, by the virtual interactions represented 
generically by figures 10.7(a) and (b). However, whereas in condensed matter 
physics one might well have the ambition to calculate such effects from first 
principles, in the field-theory case that makes no sense. Rather, by rewriting 
all calculated expressions (at a given order of perturbation theory) in terms 
of ‘renormalized’ quantities, we aim to eliminate the ‘unknown physics scale’,
$\Lambda$, from the theory. Let us now see how this works in more mathematical
detail.
10.3 Dealing with the bad news: a simple example
10.3.1 Evaluating $\Pi_C^{[2]}(q^2)$


We turn our attention to the actual evaluation of a one-loop amplitude,
beginning with the simplest, which is $-i\Pi_C^{[2]}(q^2)$:

$$ -i\Pi_C^{[2]}(q^2) = (-ig)^2\int\frac{d^4k}{(2\pi)^4}\,\frac{i}{k^2 - m_A^2 + i\epsilon}\,\frac{i}{(q-k)^2 - m_B^2 + i\epsilon}. \tag{10.39} $$

In particular, we want to know the precise mathematical form of the divergence
which arises when the momentum integral in (10.39) is not cut off at an upper
limit $\Lambda$. This will necessitate the introduction of a few modest tricks from a
large armoury (mostly due to Feynman) for dealing with such integrals.

The first move in evaluating (10.39) is to ‘combine the denominators’ using
the identity (problem 10.2)

$$ \frac{1}{AB} = \int_0^1\frac{dx}{[(1-x)A + xB]^2} \tag{10.40} $$

(similar ‘Feynman identities’ exist for combining three or more denominator
factors). Applying (10.40) to (10.39) we obtain


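$$ -i\Pi_C^{[2]}(q^2) = g^2\int_0^1 dx\int\frac{d^4k}{(2\pi)^4}\,\frac{1}{\bigl[(1-x)(k^2 - m_A^2 + i\epsilon) + x\bigl((q-k)^2 - m_B^2 + i\epsilon\bigr)\bigr]^2}. \tag{10.41} $$

Collecting up terms inside the [...] bracket and changing the integration
variable to $k' = k - xq$ leads to (problem 10.3)

$$ -i\Pi_C^{[2]}(q^2) = g^2\int_0^1 dx\int\frac{d^4k'}{(2\pi)^4}\,\frac{1}{(k'^2 - \Delta + i\epsilon)^2} \tag{10.42} $$

where

$$ \Delta = -x(1-x)q^2 + x\,m_B^2 + (1-x)\,m_A^2. \tag{10.43} $$

The $d^4k'$ integral means $dk'^0\,d^3\boldsymbol{k}'$, and $k'^2 = (k'^0)^2 - \boldsymbol{k}'^2$.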
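
As a numerical sanity check on the two steps just described, the following sketch (Python, standard library only) tests the Feynman identity (10.40) for a pair of positive real numbers, and confirms that the combined denominator equals $k'^2 - \Delta$ with $k' = k - xq$ and $\Delta = -x(1-x)q^2 + x\,m_B^2 + (1-x)\,m_A^2$. The function names, the sample momenta and the sample masses are illustrative assumptions, not part of the text, and the small $i\epsilon$ terms are dropped.

```python
import random

def feynman_rhs(a, b, n=20000):
    """Midpoint-rule estimate of the x-integral on the right-hand side of (10.40)."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        total += h / ((1 - x) * a + x * b) ** 2
    return total

def mdot(a, b):
    """Minkowski product of two 4-vectors with metric (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

def combined_denominator(k, q, x, m_a2, m_b2):
    """(1 - x)(k^2 - m_A^2) + x((q - k)^2 - m_B^2): the bracket before the shift."""
    q_minus_k = [qi - ki for qi, ki in zip(q, k)]
    return (1 - x) * (mdot(k, k) - m_a2) + x * (mdot(q_minus_k, q_minus_k) - m_b2)

def shifted_denominator(k, q, x, m_a2, m_b2):
    """k'^2 - Delta with k' = k - x q: the bracket after the shift."""
    k_prime = [ki - x * qi for ki, qi in zip(k, q)]
    delta = -x * (1 - x) * mdot(q, q) + x * m_b2 + (1 - x) * m_a2
    return mdot(k_prime, k_prime) - delta

# Illustrative (hypothetical) sample values.
random.seed(0)
k = [random.uniform(-2, 2) for _ in range(4)]
q = [random.uniform(-2, 2) for _ in range(4)]
x = random.random()

print(feynman_rhs(2.0, 5.0), 1 / (2.0 * 5.0))
print(combined_denominator(k, q, x, 1.0, 2.0), shifted_denominator(k, q, x, 1.0, 2.0))
```

The two numbers on each printed line should agree to within the quadrature and rounding errors; agreement at a handful of random points is only a consistency check, not a substitute for problems 10.2 and 10.3.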

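We now perform the $k'^0$ integration in (10.42), for which we will need the
contour integration techniques explained in appendix F. The integral we want
to calculate is

$$ \int_{-\infty}^{\infty}\frac{dk'^0}{[(k'^0)^2 - A]^2} = \frac{\partial}{\partial A}\int_{-\infty}^{\infty}\frac{dk'^0}{(k'^0)^2 - A} \equiv \frac{\partial}{\partial A}I(A) \tag{10.44} $$

where $A = \boldsymbol{k}'^2 + \Delta - i\epsilon$. We rewrite $I(A)$ as

$$ I(A) = \lim_{R\to\infty}\int_{C_R}\frac{dz}{z^2 - A} \tag{10.45} $$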
FIGURE 10.8
Location of the poles of (10.42) in the complex $k'^0$-plane.


where the contour $C_R$ is the real axis from $-R$ to $R$. Next, we identify the
points where the integrand $[z^2 - A]^{-1}$ ceases to be analytic (called ‘poles’),
which are at $z = \pm\sqrt{A} = \pm(\boldsymbol{k}'^2 + \Delta - i\epsilon)^{1/2}$. Figure 10.8 shows the location of
these points in the complex $k'^0$-plane; note that the ‘$i\epsilon$’ determines in which
half-plane each point lies (compare the similar role of the ‘$i\epsilon$’ in $(z + i\epsilon)^{-1}$, in the
proof in appendix F of the representation (6.93) for the $\theta$-function). We must
now ‘close the contour’ in order to be able to use Cauchy’s integral formula
of (F.19). We may do this by means of a large semicircle in either the upper
($C_+$) or lower ($C_-$) half-plane (again compare the discussion in appendix F).
The contribution from either such semicircle vanishes as $R \to \infty$, since on
either we have $z = Re^{i\theta}$, and

$$ \int_{C_+\ \text{or}\ C_-}\frac{dz}{z^2 - A} = \int\frac{Re^{i\theta}\,i\,d\theta}{R^2e^{2i\theta} - A} \to 0 \quad\text{as } R \to \infty. \tag{10.46} $$


For definiteness, let us choose to close the contour in the upper half-plane.
Then we are evaluating

$$ I(A) = \oint_C\frac{dz}{(z - \sqrt{A})(z + \sqrt{A})} \tag{10.47} $$

around the closed contour $C$ shown in figure 10.9, which encloses the single
non-analytic point at $z = -\sqrt{A}$. Applying Cauchy’s integral formula (F.19)
with $a = -\sqrt{A}$ and $f(z) = (z - \sqrt{A})^{-1}$, we find

$$ I(A) = 2\pi i\,\frac{1}{-2\sqrt{A}} \tag{10.48} $$

and thus

$$ \int_{-\infty}^{\infty}\frac{dk'^0}{[(k'^0)^2 - A]^2} = \frac{\pi i}{2A^{3/2}}. \tag{10.49} $$
FIGURE 10.9
The closed contour C used in the integral (10.47). 


The reader may like to try taking the other choice ($C_-$) of closing contour,
and check that the answer is the same. Reinstating the remaining integrals in
(10.42) we have finally (as $\epsilon \to 0$)

$$ -i\Pi_C^{[2]}(q^2) = \frac{ig^2}{8\pi^2}\int_0^1 dx\int_0^\infty\frac{u^2\,du}{(u^2 + \Delta)^{3/2}} \tag{10.50} $$

where $u = |\boldsymbol{k}'|$ and the integration over the angles of $\boldsymbol{k}'$ has yielded a factor
of $4\pi$. We see that the $u$-integral behaves as $\int du/u$ for large $u$, which is
logarithmically divergent, as expected from the start.
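
The contour result for the energy integral can be checked numerically. The sketch below (Python, standard library only) integrates $1/(z^2 - A)^2$ along the real axis for a complex $A$ with a finite negative imaginary part, which stands in for the infinitesimal $-i\epsilon$, and compares with $\pi i/(2A^{3/2})$, taking $\sqrt{A}$ in the lower half-plane as in figure 10.8. The substitution $z = \tan t$, the function names and the sample values of $A$ are illustrative assumptions, not part of the text.

```python
import cmath
import math

def k0_integral(a, n=4000):
    """Midpoint estimate of the integral of dz / (z^2 - a)^2 over the real line.

    The substitution z = tan(t) maps the real line to (-pi/2, pi/2); the
    integrand then becomes cos^2(t) / (sin^2(t) - a cos^2(t))^2.
    """
    h = math.pi / n
    total = 0j
    for i in range(n):
        t = -math.pi / 2 + (i + 0.5) * h
        c, s = math.cos(t), math.sin(t)
        total += c * c / (s * s - a * c * c) ** 2
    return total * h

def closed_form(a):
    """pi i / (2 a^(3/2)), with the principal square root.

    For Im(a) < 0 the principal root lies in the lower half-plane, so that
    -sqrt(a) is the pole enclosed when the contour is closed in the upper one.
    """
    return 1j * math.pi / (2 * cmath.sqrt(a) ** 3)

A_SAMPLE = 2.0 - 0.5j   # hypothetical value of k'^2 + Delta - i*epsilon
print(k0_integral(A_SAMPLE), closed_form(A_SAMPLE))
```

Because the imaginary part of $A$ is kept finite here, the check says nothing about the delicate $\epsilon \to 0$ limit; it only confirms the pole bookkeeping and the overall factor.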


10.3.2 Regularization and renormalization 


Faced with results which are infinite, one can either try to go back to the 
very beginnings of the theory and see if a totally new start can avoid the 
infinities or one can see if they can somehow be ‘lived with’. The first approach 
may yet, ultimately, turn out to be correct: perhaps a future theory will be 
altogether free of divergences (such theories do in fact exist, but none as yet 
successfully describes the pattern of particles and forces we actually seem to 
have in Nature). For the moment, it is the second approach which has been 
pursued — indeed with great success as we shall see in the next chapter and 
in volume 2. 

Accepting the general framework of quantum field theory, then, the first 
thing we must obviously do is to modify the theory in some way so that 
integrals such as (10.50) do not actually diverge, so that we can at least discuss 
finite rather than infinite quantities. This step is called ‘regularization’ of the 
theory. There are many ways to do this but for our present purposes a simple 
one will do well enough, which is to cut off the $u$-integration in (10.50) at some
finite value $\Lambda$ (remember $u$ is $|\boldsymbol{k}'|$, so $\Lambda$ here will have dimensions of energy,
or mass); such a step was given some physical motivation in section 10.1.1.
Then we can evaluate the integral straightforwardly and move on to the next
stage.

With the upper limit in (10.50) replaced by $\Lambda$, we can evaluate the $u$-integral,
obtaining (problem 10.4)


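$$ \Pi_C^{[2]}(q^2,\Lambda^2) = -\frac{g^2}{8\pi^2}\int_0^1 dx\left\{\ln\left[\frac{\Lambda + (\Lambda^2 + \Delta)^{1/2}}{\Delta^{1/2}}\right] - \frac{\Lambda}{(\Lambda^2 + \Delta)^{1/2}}\right\} \tag{10.51} $$

where from (10.43)

$$ \Delta = -x(1-x)q^2 + x\,m_B^2 + (1-x)\,m_A^2. \tag{10.52} $$

Note that $\Delta > 0$ for $q^2 < 0$.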
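
The $u$-integration that leads to (10.51) is elementary, and its closed form, $\ln[(\Lambda + (\Lambda^2+\Delta)^{1/2})/\Delta^{1/2}] - \Lambda/(\Lambda^2+\Delta)^{1/2}$, can be compared with a direct quadrature. The following sketch (Python, standard library only) does this for one positive value of $\Delta$; the function names and the sample numbers are illustrative assumptions.

```python
import math

def u_integral_numeric(cutoff, delta, n=20000):
    """Simpson estimate of the integral of u^2 / (u^2 + delta)^(3/2) from 0 to cutoff."""
    h = cutoff / n  # n must be even for Simpson's rule

    def f(u):
        return u * u / (u * u + delta) ** 1.5

    total = f(0.0) + f(cutoff)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3

def u_integral_closed(cutoff, delta):
    """Closed form of the same integral, as it appears inside the braces of (10.51)."""
    root = math.sqrt(cutoff ** 2 + delta)
    return math.log((cutoff + root) / math.sqrt(delta)) - cutoff / root

print(u_integral_numeric(50.0, 1.3), u_integral_closed(50.0, 1.3))
```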

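Inspection of (10.51) shows that as $\Lambda \to \infty$, $\Pi_C^{[2]}(q^2,\Lambda^2)$ contains a divergent
part proportional to $\ln\Lambda$. It is useful to isolate this divergent part, as
follows. For large $\Lambda$, we can expand the terms in (10.51) in powers of $\Delta/\Lambda^2$,
writing

$$ \Lambda + (\Lambda^2 + \Delta)^{1/2} = 2\Lambda\left(1 + \frac{\Delta}{4\Lambda^2} + \cdots\right) \tag{10.53} $$

and

$$ \Lambda\,(\Lambda^2 + \Delta)^{-1/2} = 1 - \frac{\Delta}{2\Lambda^2} + \cdots. \tag{10.54} $$

It follows that

$$ \Pi_C^{[2]}(q^2,\Lambda^2) = -\frac{g^2}{8\pi^2}\ln\Lambda - \frac{g^2}{8\pi^2}(\ln 2 - 1) + \frac{g^2}{16\pi^2}\int_0^1 dx\,\ln\Delta(x,q^2), \tag{10.55} $$

where terms that go to zero as $\Lambda \to \infty$ have been omitted.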
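
A quick numerical look at the large-cut-off expansion may be helpful. The sketch below (Python, standard library only) compares the exact integrand of (10.51) with its asymptotic form $\ln\Lambda + \ln 2 - 1 - \tfrac12\ln\Delta$, and checks that the remainder falls off like $3\Delta/(4\Lambda^2)$, which is what the two expansions in powers of $\Delta/\Lambda^2$ give at first order. The function names and sample values are illustrative assumptions.

```python
import math

def exact_bracket(cutoff, delta):
    """Integrand of (10.51): ln[(L + sqrt(L^2 + D)) / sqrt(D)] - L / sqrt(L^2 + D)."""
    root = math.sqrt(cutoff ** 2 + delta)
    return math.log((cutoff + root) / math.sqrt(delta)) - cutoff / root

def asymptotic_bracket(cutoff, delta):
    """Large-cutoff form: ln(L) + ln(2) - 1 - (1/2) ln(D)."""
    return math.log(cutoff) + math.log(2.0) - 1.0 - 0.5 * math.log(delta)

def remainder(cutoff, delta):
    """Difference between the exact and asymptotic forms; expected ~ 3 D / (4 L^2)."""
    return exact_bracket(cutoff, delta) - asymptotic_bracket(cutoff, delta)

for cutoff in (1e1, 1e2, 1e3):
    print(cutoff, remainder(cutoff, 2.0), 3 * 2.0 / (4 * cutoff ** 2))
```

The omitted terms are therefore suppressed by two powers of the cut-off, while the retained $\ln\Lambda$ grows without bound.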
Relation (10.19) then becomes

$$ m_C^2(\Lambda^2) = m^2_{{\rm ph},C} - \Pi_C^{[2]}(q^2 = m^2_{{\rm ph},C},\Lambda^2) \tag{10.56} $$

and there will be similar relations for the A and B masses. As noted previously,
after (10.19), the shift represented by (10.56) is an $O(g^2)$ perturbative
correction (because $\Pi_C^{[2]}$ contains a factor $g^2$), so that, again in the spirit
of systematic perturbation theory, it will be adequate to this order in $g^2$ to
replace the Lagrangian masses $m_A^2$, $m_B^2$ and $m_C^2$ inside the expressions for
$\Pi_A^{[2]}$, $\Pi_B^{[2]}$ and $\Pi_C^{[2]}$ by their physical counterparts. In this way the relations
(10.56) and the two similar ones give us the prescription for rewriting the $m_i^2$
in terms of the $m^2_{{\rm ph},i}$ and $\Lambda^2$. Of course, when this is done in the propagators,
the result is just to produce the desired form $\sim(q^2 - m^2_{{\rm ph},i})^{-1}$, to this order.

So, for the propagator at this one-loop order, the effect of such mass shifts
is essentially trivial: the large $\Lambda$ behaviour is simply absorbed into $m_i^2$. What
about $Z_C$? This was defined via (10.28) in terms of the quantity

$$ \left.\frac{d\Pi_C^{[2]}}{dq^2}\right|_{q^2 = m^2_{{\rm ph},C}}. \tag{10.57} $$
However, equation (10.55) shows that the divergent part of $\Pi_C^{[2]}$ is independent
of $q^2$, or equivalently that the quantity (10.57) is finite. It follows that $Z_C$ is
finite in this theory. In other theories, quantities analogous to (10.55) might
contain a $q^2$-dependent divergence, which would be formally absorbed in the
rescaling represented by $Z_C$.
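
The finiteness of (10.57) can be made explicit in one line. This is a sketch that assumes the large-$\Lambda$ form of the self-energy in which the cut-off enters only through a $q^2$-independent $\ln\Lambda$ term, while the $q^2$ dependence sits entirely in $\frac{g^2}{16\pi^2}\int_0^1 dx\,\ln\Delta(x,q^2)$, with $\partial\Delta/\partial q^2 = -x(1-x)$:

```latex
\frac{d\Pi_C^{[2]}}{dq^2}
  = \frac{g^2}{16\pi^2}\int_0^1 dx\,\frac{1}{\Delta}\,\frac{\partial\Delta}{\partial q^2}
  = -\frac{g^2}{16\pi^2}\int_0^1 dx\,\frac{x(1-x)}{\Delta(x,q^2)} .
```

No $\Lambda$ appears on the right-hand side, and the integral is finite as long as $\Delta$ does not vanish on the integration range.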

We may also analyse the vertex correction $G^{[2]}$ of figure 10.6, and conclude
that it too is finite, because there are now three propagators giving six powers
of $k$ in the denominator, with still only a four-dimensional $d^4k$ integration.
Once again, the analogous vertex correction in QED is divergent, as we shall 
see in chapter 11; there too this divergence can be absorbed into a redefinition 
of the physical charge. The ABC theory is, in fact, a ‘super-renormalizable’ 
one, meaning (loosely) that it has fewer divergences than might be expected. 
We shall come back to the classification of theories (renormalizable, non- 
renormalizable and super-renormalizable) at the end of the following chapter. 

While it is not our purpose to present a full discussion of one-loop renor- 
malization in the ABC theory (because it is not of any direct physical interest) 
we will use it to introduce one more important idea before turning, in the next 
chapter, to one-loop QED. 


10.4 Bare and renormalized perturbation theory 
10.4.1 Reorganizing perturbation theory 


We have seen that, of the one-loop effects listed at the end of section 10.2, the
mass shifts given by equations such as (10.14) do involve formal divergences
as $\Lambda \to \infty$, but the vertex correction and field strength renormalization are
finite in the ABC theory. We shall find that in QED the corresponding quantities
are all divergent, so that the perturbative replacement of all Lagrangian
parameters by their ‘physical’ counterparts, together with field strength
renormalizations, is mandatory in QED in order to get rid of $\ln\Lambda$ terms. However,
this process (evaluating the connections between the two sets of parameters,
and then inserting them into all the calculated amplitudes) is likely
to be very cumbersome. In this section, we shall introduce an alternative
formulation, which has both calculational and conceptual advantages.

By way of motivation, consider the QED analogue of the divergent part
of equation (10.7), which contributes a correction to the bare electron mass
of the form $\alpha m\ln(\Lambda/m)$ where $m$ is the electron mass. At $\Lambda = 100$ GeV the
magnitude of this is about 0.04 MeV (if we take $m$ to have the physical value),
which is a shift of some 10%. The application of perturbation theory would
seem more plausible if this kind of correction were to be included from the
start, so that the ‘free’ part of the Hamiltonian (or Lagrangian) involved the
physical fields and parameters, rather than the (unobserved) ones appearing
in the original theory. Then the main effects, in some sense, would already be
included by the use of these (empirical) physical quantities, and corrections
would be ‘more plausibly’ small. This is indeed the main reason for the usefulness
of such ‘effective’ parameters in the analogous case of condensed matter
physics. Actually, of course, in quantum field theory the corrections will be
just as infinite (if we send $\Lambda$ to infinity) in this approach also, since whichever
way we set the calculation up, we shall get loops, which are divergent. All the
same, this kind of ‘reorganization’ does offer a more systematic approach to
renormalization.
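
The size of the correction quoted above is easy to reproduce. The sketch below (Python, standard library only) evaluates $\alpha m\ln(\Lambda/m)$ at $\Lambda = 100$ GeV. It assumes a numerical coefficient of exactly 1 in front of the logarithm (the text only says ‘of the form’), together with the approximate values $\alpha \approx 1/137.036$ and $m \approx 0.511$ MeV; the variable and function names are illustrative.

```python
import math

ALPHA = 1 / 137.036      # fine-structure constant (approximate value, assumed)
M_E_MEV = 0.511          # physical electron mass in MeV (approximate value, assumed)
CUTOFF_MEV = 100e3       # Lambda = 100 GeV expressed in MeV

def mass_shift_mev(cutoff_mev, m_mev=M_E_MEV, alpha=ALPHA):
    """alpha * m * ln(Lambda / m), with the overall coefficient taken to be 1."""
    return alpha * m_mev * math.log(cutoff_mev / m_mev)

shift = mass_shift_mev(CUTOFF_MEV)
fraction = shift / M_E_MEV
print(f"shift = {shift:.4f} MeV, fraction of m = {fraction:.3f}")
```

With these inputs the shift comes out between 0.04 and 0.05 MeV, a little under a tenth of the electron mass, consistent with the rough figures in the text.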
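To illustrate the idea, consider again our ABC Lagrangian

$$ \hat{\mathcal L} = \hat{\mathcal L}_{0A} + \hat{\mathcal L}_{0B} + \hat{\mathcal L}_{0C} + \hat{\mathcal L}_{\rm int} \tag{10.58} $$

where

$$ \hat{\mathcal L}_{0C} = \tfrac12\,\partial_\mu\hat\phi_C\,\partial^\mu\hat\phi_C - \tfrac12\,m_C^2\hat\phi_C^2 \tag{10.59} $$

and similarly for $\hat{\mathcal L}_{0A}$, $\hat{\mathcal L}_{0B}$; and where

$$ \hat{\mathcal L}_{\rm int} = -g\,\hat\phi_A\hat\phi_B\hat\phi_C. \tag{10.60} $$

There are two obvious moves to make: (i) introduce the rescaled (renormalized)
fields by

$$ \hat\phi_{{\rm ph},i}(x) = Z_i^{-1/2}\,\hat\phi_i(x) \tag{10.61} $$

in order to get rid of the $\sqrt{Z_i}$ factors in the S-matrix elements; and (ii)
introduce the physical masses $m^2_{{\rm ph},i}$. Consider first the non-interacting parts
of $\hat{\mathcal L}$, namely

$$ \hat{\mathcal L}_0 = \hat{\mathcal L}_{0A} + \hat{\mathcal L}_{0B} + \hat{\mathcal L}_{0C}. \tag{10.62} $$

Singling out the C-parameters for definiteness, $\hat{\mathcal L}_0$ can then be written as

\begin{align}
\hat{\mathcal L}_0 &= \tfrac12 Z_C\,\partial_\mu\hat\phi_{{\rm ph},C}\,\partial^\mu\hat\phi_{{\rm ph},C} - \tfrac12 m_C^2 Z_C\,\hat\phi^2_{{\rm ph},C} + \cdots \notag\\
&= \tfrac12\,\partial_\mu\hat\phi_{{\rm ph},C}\,\partial^\mu\hat\phi_{{\rm ph},C} - \tfrac12 m^2_{{\rm ph},C}\,\hat\phi^2_{{\rm ph},C} \notag\\
&\quad + \tfrac12 (Z_C - 1)\,\partial_\mu\hat\phi_{{\rm ph},C}\,\partial^\mu\hat\phi_{{\rm ph},C} - \tfrac12 (m_C^2 Z_C - m^2_{{\rm ph},C})\,\hat\phi^2_{{\rm ph},C} + \cdots \tag{10.63}\\
&= \hat{\mathcal L}_{0{\rm ph},C} + \bigl\{\tfrac12\,\delta Z_C\,\partial_\mu\hat\phi_{{\rm ph},C}\,\partial^\mu\hat\phi_{{\rm ph},C} - \tfrac12 (\delta Z_C\,m^2_{{\rm ph},C} + \delta m_C^2\,Z_C)\,\hat\phi^2_{{\rm ph},C}\bigr\} + \cdots \tag{10.64}
\end{align}

where $\hat{\mathcal L}_{0{\rm ph},C}$ is the standard free-C Lagrangian in terms of the physical field
and mass, which leads to a Feynman propagator $i/(k^2 - m^2_{{\rm ph},C} + i\epsilon)$ in the
usual way; also, $\delta Z_C = Z_C - 1$ and $\delta m_C^2 = m_C^2 - m^2_{{\rm ph},C}$. In (10.64) the dots
signify similar rearrangements of $\hat{\mathcal L}_{0A}$ and $\hat{\mathcal L}_{0B}$. Note that $Z_C$ and $m_C^2$ are
understood to depend on $\Lambda$, as usual, although this has not been indicated
explicitly.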

We now regard ‘$\hat{\mathcal L}_{0{\rm ph},A} + \hat{\mathcal L}_{0{\rm ph},B} + \hat{\mathcal L}_{0{\rm ph},C}$’ as the ‘unperturbed’ part of $\hat{\mathcal L}$,
and all the remainder of (10.64) as perturbations additional to the original $\hat{\mathcal L}_{\rm int}$
(much of theoretical physics consists of exploiting the identity
‘$a + b = (a + c) + (b - c)$’). The effect of this rearrangement is to introduce new perturbations,
namely $\tfrac12\,\delta Z_C\,\partial_\mu\hat\phi_{{\rm ph},C}\,\partial^\mu\hat\phi_{{\rm ph},C}$ and the $\hat\phi^2_{{\rm ph},C}$ term in (10.64), together with
similar terms for the A and B fields. Such additional perturbations are called
‘counter terms’ and they must be included in our new perturbation theory
based on the $\hat{\mathcal L}_{0{\rm ph},i}$ pieces. As usual, this is conveniently implemented in
terms of associated Feynman diagrams. Since both of these counter terms
involve just the square of the field, it should be clear that they only have
non-zero matrix elements between one-particle states, so that the associated
diagram has the form shown in figure 10.10, which includes both these
C-contributions. Problem 10.5 shows that the Feynman rule for figure 10.10
is that it contributes $i[\delta Z_C\,k^2 - (\delta Z_C\,m^2_{{\rm ph},C} + \delta m_C^2\,Z_C)]$ to the $1C \to 1C$
amplitude.

FIGURE 10.10
Counter term corresponding to the terms in braces in (10.64).

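The original interaction term $\hat{\mathcal L}_{\rm int}$ may also be rewritten in terms of the
physical fields and a physical (renormalized) coupling constant $g_{\rm ph}$:

\begin{align}
-g\,\hat\phi_A\hat\phi_B\hat\phi_C &= -g\,(Z_A Z_B Z_C)^{1/2}\,\hat\phi_{{\rm ph},A}\hat\phi_{{\rm ph},B}\hat\phi_{{\rm ph},C} \notag\\
&= -g_{\rm ph}\,\hat\phi_{{\rm ph},A}\hat\phi_{{\rm ph},B}\hat\phi_{{\rm ph},C} - (Z_V - 1)\,g_{\rm ph}\,\hat\phi_{{\rm ph},A}\hat\phi_{{\rm ph},B}\hat\phi_{{\rm ph},C} \tag{10.65}
\end{align}

where

$$ Z_V\,g_{\rm ph} = g\,(Z_A Z_B Z_C)^{1/2}. \tag{10.66} $$

The interpretation of (10.66) is clearly that ‘$g_{\rm ph}$’ is the coupling constant
describing the interactions among the $\hat\phi_{{\rm ph},i}$ fields, while the ‘$(Z_V - 1)$’ term
is another counter term, having the structure shown in figure 10.11.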

In summary, we have reorganized $\hat{\mathcal L}$ so as to base perturbation theory
on a part describing the free renormalized fields (rather than the fields in 
the original Lagrangian); in this formulation we find that, in addition to the 
(renormalized) ABC-interaction term, further terms have appeared which are 
interpreted as additional perturbations, called counter terms. These counter 
terms are determined, at each order in this (renormalized) perturbation the- 
ory, by what are basically self-consistency conditions — such as, for example, 
the requirement that the propagators really do reduce to the physical ones 
at the ‘mass-shell’ points. We shall now illustrate this procedure for the C 
propagator. 
FIGURE 10.11
Counter term corresponding to the ‘$(Z_V - 1)$’ term in (10.66).


10.4.2 The $O(g^2_{\rm ph})$ renormalized self-energy revisited: how counter terms are determined by renormalization conditions


Let us return to the calculation of the C propagator, following the same
procedure as in section 10.1, but this time ‘perturbing’ away from $\hat{\mathcal L}_{0{\rm ph},i}$ and
including the contribution from the counter term of figure 10.10, in addition
to the $O(g^2_{\rm ph})$ self-energy. The expression (10.14) will now be replaced by

$$ \frac{i}{q^2 - m^2_{{\rm ph},C} + q^2\,\delta Z_C - \delta Z_C\,m^2_{{\rm ph},C} - \delta m_C^2\,Z_C - \Pi^{[2]}_{{\rm ph},C}(q^2,\Lambda^2)} \tag{10.67} $$

where

$$ -i\Pi^{[2]}_{{\rm ph},C}(q^2,\Lambda^2) = (-ig_{\rm ph})^2\int\frac{d^4k}{(2\pi)^4}\,\frac{i}{k^2 - m^2_{{\rm ph},A} + i\epsilon}\,\frac{i}{(q-k)^2 - m^2_{{\rm ph},B} + i\epsilon} \tag{10.68} $$

and where we have indicated the cut-off dependence on the left-hand side,
leaving it understood on the right. Comparing (10.68) with (10.39) we see
that they are exactly the same, except that $\Pi^{[2]}_{{\rm ph},C}$ involves the ‘physical’
coupling constant $g_{\rm ph}$ and the physical masses, as expected in this renormalized
perturbation theory. In particular, $\Pi^{[2]}_{{\rm ph},C}$ will be divergent in exactly the same
way as $\Pi^{[2]}_C$, as the cut-off $\Lambda$ goes to infinity.


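The essence of this ‘reorganized’ perturbation theory is that we now determine
$\delta Z_C$ and $\delta m_C^2$ from the condition that as $q^2 \to m^2_{{\rm ph},C}$, the propagator
(10.67) reduces to $i/(q^2 - m^2_{{\rm ph},C})$, i.e. it correctly represents the physical C
propagator at the mass-shell point, with standard normalization. Expanding
$\Pi^{[2]}_{{\rm ph},C}(q^2,\Lambda^2)$ about $q^2 = m^2_{{\rm ph},C}$ then, we reach the approximate form of (10.67),
valid for $q^2 \approx m^2_{{\rm ph},C}$:

$$ \frac{i}{(q^2 - m^2_{{\rm ph},C})\,Z_C - \delta m_C^2\,Z_C - \Pi^{[2]}_{{\rm ph},C}(m^2_{{\rm ph},C},\Lambda^2) - (q^2 - m^2_{{\rm ph},C})\left.\dfrac{d\Pi^{[2]}_{{\rm ph},C}}{dq^2}\right|_{q^2 = m^2_{{\rm ph},C}}}. \tag{10.69} $$

Requiring that this has the form $i/(q^2 - m^2_{{\rm ph},C})$ gives

\begin{align}
\text{condition (a)}\quad & \delta m_C^2 = -Z_C^{-1}\,\Pi^{[2]}_{{\rm ph},C}(m^2_{{\rm ph},C},\Lambda^2) \notag\\
\text{condition (b)}\quad & Z_C = 1 + \left.\frac{d\Pi^{[2]}_{{\rm ph},C}}{dq^2}\right|_{q^2 = m^2_{{\rm ph},C}}. \tag{10.70}
\end{align}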


Looking first at condition (b), we see that our renormalization constant $Z_C$
has, in this approach, been determined up to $O(g^2)$ by an equation that is, in
fact, very similar to (10.28), but it is expressed in terms of physical parameters.
As regards (a), since $Z_C = 1 + O(g^2_{\rm ph})$, it is sufficient to replace it by 1 on
the right-hand side of (a), so that, to this order, $\delta m_C^2 = -\Pi^{[2]}_{{\rm ph},C}(m^2_{{\rm ph},C},\Lambda^2)$.
Once again, this is similar to (10.56), but written in terms of the physical
quantities from the outset. We indicate that these evaluations of $Z_C$ and $\delta m_C^2$
are correct to second order by adding a superscript, as in $Z_C^{[2]}$.

Of course, we have not avoided the infinities (in the limit $\Lambda \to \infty$) in this
approach! It is still true that the loop integral in $\Pi^{[2]}_{{\rm ph},C}$ diverges logarithmically
and so the mass shift $(\delta m_C^2)^{[2]}$ is infinite as $\Lambda \to \infty$. Nevertheless, this
is a conceptually cleaner way to do the business. It is called ‘renormalized 
perturbation theory’, as opposed to our first approach which is called ‘bare 
perturbation theory’. What we there called the ‘Lagrangian fields and pa- 
rameters’ are usually called the ‘bare’ ones; the ‘renormalized’ quantities are 
‘clothed’ by the interactions. 

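We may now return to our propagator (10.67), and insert the results
(10.70) to obtain the final important expression for the C propagator
containing the one-loop $O(g^2_{\rm ph})$ renormalized self-energy:

$$ \frac{i}{q^2 - m^2_{{\rm ph},C} - \bar\Pi^{[2]}_{{\rm ph},C}(q^2)} \tag{10.71} $$

where

$$ \bar\Pi^{[2]}_{{\rm ph},C}(q^2) = \Pi^{[2]}_{{\rm ph},C}(q^2,\Lambda^2) - \Pi^{[2]}_{{\rm ph},C}(m^2_{{\rm ph},C},\Lambda^2) - (q^2 - m^2_{{\rm ph},C})\left.\frac{d\Pi^{[2]}_{{\rm ph},C}}{dq^2}\right|_{q^2 = m^2_{{\rm ph},C}}. \tag{10.72} $$

We remind the reader that $\Pi^{[2]}_{{\rm ph},C}(q^2,\Lambda^2)$ has exactly the same form as $\Pi^{[2]}_C(q^2,\Lambda^2)$
except that $g^2$ and $m_i^2$ are replaced by $g^2_{\rm ph}$ and $m^2_{{\rm ph},i}$. From (10.55) it then
follows that, as $\Lambda \to \infty$,

$$ \Pi^{[2]}_{{\rm ph},C}(q^2,\Lambda^2) = -\frac{g^2_{\rm ph}}{8\pi^2}\ln\Lambda - \frac{g^2_{\rm ph}}{8\pi^2}(\ln 2 - 1) + \frac{g^2_{\rm ph}}{16\pi^2}\int_0^1 dx\,\ln\Delta(x,q^2), \tag{10.73} $$

and hence

$$ \Pi^{[2]}_{{\rm ph},C}(q^2,\Lambda^2) - \Pi^{[2]}_{{\rm ph},C}(m^2_{{\rm ph},C},\Lambda^2) = \frac{g^2_{\rm ph}}{16\pi^2}\int_0^1 dx\,\ln\left[\frac{\Delta(x,q^2)}{\Delta(x,m^2_{{\rm ph},C})}\right] \tag{10.74} $$

which is finite as $\Lambda \to \infty$. It is also clear from (10.73) that $d\Pi^{[2]}_{{\rm ph},C}/dq^2$ is
finite as $\Lambda \to \infty$. Thus the quantity $\bar\Pi^{[2]}_{{\rm ph},C}(q^2)$ is finite as $\Lambda \to \infty$, and
is understood to be evaluated in that limit; the subtraction in (10.74) has
removed the infinity. The additional subtraction in (10.72) would in fact
have removed a logarithmic divergence in $Z_C$, had there been one. Note that
the form of (10.72) guarantees that the leading behaviour of $\bar\Pi^{[2]}_{{\rm ph},C}(q^2)$ near
$q^2 = m^2_{{\rm ph},C}$ is $(q^2 - m^2_{{\rm ph},C})^2$, so that the behaviour of (10.71) near the
mass-shell point is indeed $i/(q^2 - m^2_{{\rm ph},C})$ as desired.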
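
The cancellation of the cut-off in the twice-subtracted self-energy can be seen numerically. The sketch below (Python, standard library only) evaluates the cut-off self-energy from the closed form of the $u$-integral, builds $\bar\Pi^{[2]}_{{\rm ph},C}(q^2)$ by subtracting the value and slope at the mass shell, and compares the result for two very different cut-offs and with the cut-off-free expression $\frac{g^2_{\rm ph}}{16\pi^2}\int_0^1 dx\,\{\ln[\Delta(x,q^2)/\Delta(x,m^2_{{\rm ph},C})] + (q^2 - m^2_{{\rm ph},C})\,x(1-x)/\Delta(x,m^2_{{\rm ph},C})\}$. The parameter values ($g_{\rm ph} = 1$, all physical masses equal to 1), the Simpson quadrature, the finite-difference slope and all function names are illustrative assumptions; $q^2$ is kept below $(m_{{\rm ph},A} + m_{{\rm ph},B})^2$ so that $\Delta > 0$.

```python
import math

G_PH = 1.0                          # illustrative coupling (assumed)
M_A2, M_B2, M_C2 = 1.0, 1.0, 1.0    # illustrative physical masses squared (assumed)
PREF = G_PH ** 2 / (8 * math.pi ** 2)

def delta(x, q2):
    """Delta(x, q^2) = -x(1-x) q^2 + x m_B^2 + (1-x) m_A^2."""
    return -x * (1 - x) * q2 + x * M_B2 + (1 - x) * M_A2

def simpson(f, n=2000):
    """Simpson's rule on [0, 1] with n (even) intervals."""
    h = 1.0 / n
    total = f(0.0) + f(1.0)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3

def pi_cutoff(q2, cutoff):
    """Cut-off self-energy, using the closed form of the u-integral."""
    def integrand(x):
        d = delta(x, q2)
        root = math.sqrt(cutoff ** 2 + d)
        return math.log((cutoff + root) / math.sqrt(d)) - cutoff / root
    return -PREF * simpson(integrand)

def pi_bar(q2, cutoff, h=1e-3):
    """Twice-subtracted self-energy; the slope is a central finite difference."""
    slope = (pi_cutoff(M_C2 + h, cutoff) - pi_cutoff(M_C2 - h, cutoff)) / (2 * h)
    return pi_cutoff(q2, cutoff) - pi_cutoff(M_C2, cutoff) - (q2 - M_C2) * slope

def pi_bar_limit(q2):
    """Cut-off-free form of the twice-subtracted self-energy."""
    def integrand(x):
        return (math.log(delta(x, q2) / delta(x, M_C2))
                + (q2 - M_C2) * x * (1 - x) / delta(x, M_C2))
    return 0.5 * PREF * simpson(integrand)

for cutoff in (1e3, 1e6):
    print(cutoff, pi_cutoff(2.5, cutoff), pi_bar(2.5, cutoff))
print("limit form:", pi_bar_limit(2.5))
```

The unsubtracted values change by roughly $(g^2_{\rm ph}/8\pi^2)\ln 10^3$ between the two cut-offs, while the subtracted values should agree with each other and with the limit form to within the small numerical errors of the sketch.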

A succinct way of summarizing our final renormalized result (10.71), with
the definition (10.72), is to say that the C propagator may be defined by
(10.71) where the $O(g^2_{\rm ph})$ renormalized self-energy $\bar\Pi^{[2]}_{{\rm ph},C}$ satisfies the
renormalization conditions

$$ \bar\Pi^{[2]}_{{\rm ph},C}(q^2 = m^2_{{\rm ph},C}) = 0, \qquad \left.\frac{d}{dq^2}\bar\Pi^{[2]}_{{\rm ph},C}(q^2)\right|_{q^2 = m^2_{{\rm ph},C}} = 0. \tag{10.75} $$

Relations analogous to (10.75) clearly hold for the A and B self-energies also.
In this definition, the explicit introduction and cancellation of large-$\Lambda$ terms
has disappeared from sight, and all that remains is the importation of one
constant from experiment, $m_{{\rm ph},C}$, and a (hidden) rescaling of the fields. It is
useful to bear this viewpoint in mind when considering more general theories, 
including ones that are ‘non-renormalizable’ (see section 11.8 of the following 
chapter). 

There is a lot of good physics in the expression (10.71), which we shall elu- 
cidate in the realistic case of QED in the next chapter. For the moment, we 
just whet the reader’s appetite by pointing out that (10.71) must amount to 
the prediction of a finite, calculable correction to the Yukawa 1-C exchange
potential, which after all is given by the Fourier transform of the (static form 
of) the propagator, as we learned long ago. In the case of QED, this will 
amount to a calculable correction to Coulomb’s law, due to radiative correc- 
tions, as we shall discuss in section 11.5.1. 

FIGURE 10.12
$O(g^4)$ contribution to $A + B \to A + B$, involving a propagator correction
inserted in an external line.


There is an important technical implication we may draw from (10.75).
Consider the Feynman diagram of figure 10.12 in which a propagator correction
has been inserted in an external line. This diagram is of order $g^4_{\rm ph}$, and
should presumably be included along with the others at this order. However,
the conditions (10.75), in this case written for $\bar\Pi^{[2]}_{{\rm ph},A}$, imply that it vanishes.
Omitting irrelevant factors, the amplitude for figure 10.12 is

$$ \bar\Pi^{[2]}_{{\rm ph},A}(p_a^2)\,\frac{1}{p_a^2 - m^2_{{\rm ph},A}}\,\frac{1}{q^2 - m^2_{{\rm ph},C}} \tag{10.76} $$

and we need to take the limit $p_a^2 \to m^2_{{\rm ph},A}$ since the external A particle is
on-shell. Expanding $\bar\Pi^{[2]}_{{\rm ph},A}$ about the point $p_a^2 = m^2_{{\rm ph},A}$ and using conditions
(10.75) for $C \to A$ we see that (10.76) vanishes. Thus with this definition,
propagator corrections do not need to be applied to external lines.
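
The vanishing can be spelled out with a Taylor expansion. This is a sketch, assuming that the second derivative of $\bar\Pi^{[2]}_{{\rm ph},A}$ is finite at the mass shell; the two renormalization conditions remove the constant and linear terms, so that

```latex
\bar\Pi^{[2]}_{{\rm ph},A}(p_a^2)
  = \tfrac12\,(p_a^2 - m^2_{{\rm ph},A})^2
    \left.\frac{d^2\bar\Pi^{[2]}_{{\rm ph},A}}{d(p_a^2)^2}\right|_{p_a^2 = m^2_{{\rm ph},A}} + \cdots
\quad\Longrightarrow\quad
\bar\Pi^{[2]}_{{\rm ph},A}(p_a^2)\,\frac{1}{p_a^2 - m^2_{{\rm ph},A}}
  \;\propto\; (p_a^2 - m^2_{{\rm ph},A}) \;\to\; 0 .
```

One power of $(p_a^2 - m^2_{{\rm ph},A})$ cancels the external propagator pole and the remaining power sends the amplitude to zero on-shell.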




10.5 Renormalizability 


We have seen how divergences present in self-energy loops like figure 10.7(a) 
can be eliminated by supposing that the ‘bare’ masses in the original La- 
grangian depend on the cut-off in just such a way as to cancel the divergences, 
leaving a finite value for the physical masses. The latter are, however, param- 
eters to be taken from experiment: they are not calculable. Alternatively, we 
may rephrase perturbation theory in terms of renormalized quantities from the 
outset, in which case the loop divergence is cancelled by appropriate counter 
terms; but again the physical masses have to be taken from experiment. We 
pointed out that, in the ABC theory, neither the field strength renormalizations
$Z_i$ nor the vertex diagrams of figure 10.5 were divergent, but we shall see
in the next chapter that the analogous quantities in QED are divergent. These
divergences too can be absorbed into redefinitions of the ‘physical’ fields and
a ‘physical’ coupling constant (the latter again to be taken from experiment).
Or, again, such divergences can be cancelled by appropriate counter terms in
the renormalized perturbation theory approach.

FIGURE 10.13
(a) $O(g^4)$ one-loop contribution to $A + B \to A + B$; (b) counter term that
would be required if (a) were divergent.


In general, a theory will have various divergences at the one-loop level, 
and new divergences will enter as we go up in order of perturbation theory (or 
number of loops). Typically, therefore, quantum field theories betray sensitiv- 
ity to unknown short-distance physics by the presence of formal divergences 
in loops, as a cut-off $\Lambda \to \infty$. In a renormalizable theory, this sensitivity can
be systematically removed by accepting that a finite number of parameters 
are uncalculable, and must be taken from experiment. These are the suitably 
defined ‘physical’ values of the masses and coupling constants appearing in 
the Lagrangian. Once these parameters are given, all other quantities are 
finite and calculable, to any desired order in perturbation theory — assuming, 
of course, that terms in successive orders diminish sensibly in size. 

Alternatively, we may say that a renormalizable theory is one in which a 
finite number of counter terms can be so chosen as to cancel all divergences 
order by order in renormalized perturbation theory. Note, now, that the only 
available counter terms are the ones which arise in the process of ‘reorganizing’ 
the original theory in terms of renormalized quantities plus extra bits (the 
counter terms). All the counter terms must correspond to masses, interactions, 
etc which are present in the original (or ‘bare’) Lagrangian — which is, in fact, 
the theory we are trying to make sense of! We are not allowed to add in any 
old kind of counter term — if we did, we would be redefining the theory. 

We can illustrate this point by considering, for example, a one-loop ($O(g^4)$)
contribution to $AB \to AB$ scattering, as shown in figure 10.13(a). If this graph
is divergent, we will need a counter term with the structure shown in
figure 10.13(b) to cancel the divergence, but there is no such ‘contact’ $AB \to AB$
interaction in the original theory (it would have the form $\lambda\hat\phi_A^2(x)\hat\phi_B^2(x)$). In
fact, the graph is convergent, as indicated by the usual power-counting (four
powers of $k$ in the numerator, eight in the denominator from the four
propagators). And indeed, the ABC theory is renormalizable (or rather, as noted
earlier, ‘super-renormalizable’).
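
The power-counting used here and in section 10.3.2 can be put into a few lines. The sketch below (Python, standard library only) counts four powers of loop momentum per loop from $d^4k$ and two powers in the denominator per scalar propagator, and applies the count to the one-loop self-energy, vertex and box diagrams of the ABC theory. The function names and the term ‘superficial degree’ are assumptions introduced for illustration, and the count is only a naive guide.

```python
def superficial_degree(loops, propagators, spacetime_dim=4):
    """Powers of loop momentum in the measure minus those in the denominator.

    Each loop contributes spacetime_dim powers (from d^4k in four dimensions)
    and each scalar propagator contributes two powers in the denominator.
    """
    return spacetime_dim * loops - 2 * propagators

def classify(degree):
    """Naive reading of the count for a diagram with no divergent subdiagram."""
    if degree > 0:
        return "power divergent"
    if degree == 0:
        return "logarithmically divergent"
    return "convergent"

one_loop_diagrams = {
    "self-energy (2 propagators)": 2,
    "vertex correction (3 propagators)": 3,
    "box, AB -> AB (4 propagators)": 4,
}
for name, n_props in one_loop_diagrams.items():
    d = superficial_degree(1, n_props)
    print(f"{name}: degree {d}, {classify(d)}")
```

Only the self-energy comes out logarithmically divergent; the vertex and box counts are negative, in line with the statements in the text.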

We shall have something more to say about renormalizability and non- 
renormalizability (is it fatal?), at the end of the following chapter. The first 
and main business, however, will be to apply what we have learned here to 
QED. 




Problems 


10.1 Carry out the indicated change of variables so as to obtain (10.4) from 
(10.3). 


10.2 Verify the Feynman identity (10.40). 
10.3 Obtain (10.42) from (10.41). 


10.4 Obtain (10.51) from (10.50), having replaced the upper limit of the
$u$-integral by $\Lambda$.


10.5 Obtain the Feynman rule quoted in the text for the sum of the counter 
terms appearing in (10.64). 


11 


Loops and Renormalization II: QED 


The present electrodynamics is certainly incomplete, but is no longer cer- 
tainly incorrect. 


—F. J. Dyson (1949b) 


We now turn to the analysis of loop corrections in QED. As we might expect, 
a theory with fermionic and gauge fields proves to be a tougher opponent than 
one with only spinless particles, even though we restrict ourselves to one-loop 
diagrams only. 

At the outset we must make one important disclaimer. In QED many 
loop diagrams diverge not only as the loop momentum goes to infinity (‘ul- 
traviolet divergence’) but also as it goes to zero (‘infrared divergence’). This 
phenomenon can only arise when there are massless particles in the theory — 
for otherwise the propagator factors $\sim(k^2 - M^2)^{-1}$ will always prevent any
infinity at low k. Of course, in a gauge theory we do have just such mass- 
less quanta. Our main purpose here is to demonstrate how the ultraviolet 
divergences can be tamed and we must refer the reader to Weinberg (1995, 
chapter 13), or to Peskin and Schroeder (1995, section 6.5), for instruction in 
dealing with the infrared problem. The remedy lies, essentially, in a careful 
consideration of the contribution, to physical cross sections, of amplitudes in- 
volving the real emission of very low frequency photons, along with infrared 
divergent virtual photon processes. It is a ‘technical’ problem, having to do 
with massless particles (of which there are not that many), whereas ultraviolet 
divergences are generic. 




11.1 Counter terms 


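We shall consider the simplest case of a single fermion of bare mass $m_0$ and
bare charge $e_0$ ($e_0 > 0$) interacting with the Maxwell field, for which the bare
(i.e. actual!) Lagrangian is

$$ \hat{\mathcal L} = \bar{\hat\psi}_0(i\slashed\partial - m_0)\hat\psi_0 - e_0\,\bar{\hat\psi}_0\gamma^\mu\hat\psi_0\,\hat A_{0\mu} - \tfrac14\,\hat F_{0\mu\nu}\hat F_0^{\mu\nu} - \frac{1}{2\xi_0}(\partial\cdot\hat A_0)^2 \tag{11.1} $$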
FIGURE 11.1
Counter terms in QED: (a) electron mass and wavefunction; (b) photon
wavefunction; (c) vertex part.


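according to chapter 7. We shall adopt the ‘renormalized perturbation theory’
approach and begin by introducing field strength renormalizations via

\begin{align}
\hat\psi &= Z_2^{-1/2}\,\hat\psi_0 \tag{11.2}\\
\hat A^\mu &= Z_3^{-1/2}\,\hat A_0^\mu \tag{11.3}
\end{align}

where the ‘physical’ fields and parameters will now simply have no ‘0’
subscript. This will lead to a rewriting of the free and gauge-fixing part of (11.1):

\begin{align}
& \bar{\hat\psi}_0(i\slashed\partial - m_0)\hat\psi_0 - \tfrac14\,\hat F_{0\mu\nu}\hat F_0^{\mu\nu} - \frac{1}{2\xi_0}(\partial\cdot\hat A_0)^2 \notag\\
&\quad = \bar{\hat\psi}(i\slashed\partial - m)\hat\psi - \tfrac14\,\hat F_{\mu\nu}\hat F^{\mu\nu} - \frac{1}{2\xi}(\partial\cdot\hat A)^2 \notag\\
&\qquad + \bigl[(Z_2 - 1)\,\bar{\hat\psi}\,i\slashed\partial\,\hat\psi - \delta m\,\bar{\hat\psi}\hat\psi\bigr] - \tfrac14 (Z_3 - 1)\,\hat F_{\mu\nu}\hat F^{\mu\nu} \tag{11.4}
\end{align}

where $\xi = \xi_0/Z_3$ and $\delta m = m_0 Z_2 - m$ (compare (10.64)). We see the emergence
of the expected ‘$\bar{\hat\psi}\dots\hat\psi$’ and ‘$\hat F\cdot\hat F$’ counter terms in (11.4), affecting
both the fermion and the gauge-field propagators. Next, we write the
interaction in terms of a physical $e$, and the physical fields, together with a
compensating third counter term:

$$ -e_0\,\bar{\hat\psi}_0\gamma^\mu\hat\psi_0\,\hat A_{0\mu} = -e\,\bar{\hat\psi}\gamma^\mu\hat\psi\,\hat A_\mu - (Z_1 - 1)\,e\,\bar{\hat\psi}\gamma^\mu\hat\psi\,\hat A_\mu \tag{11.5} $$

where, with the aid of (11.2) and (11.3),

$$ Z_1\,e = e_0\,Z_2\,Z_3^{1/2}. \tag{11.6} $$

The three counter terms are represented diagrammatically as shown in
figures 11.1(a), (b) and (c), for which the Feynman rules are, respectively,

\begin{align}
\text{(a):}\quad & i[\slashed p\,(Z_2 - 1) - \delta m] \notag\\
\text{(b):}\quad & -i(g^{\mu\nu}k^2 - k^\mu k^\nu)(Z_3 - 1) \tag{11.7}\\
\text{(c):}\quad & -ie\gamma^\mu (Z_1 - 1). \notag
\end{align}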
FIGURE 11.2
Elementary one-loop divergent diagrams in QED.


These counter terms will compensate for the ultraviolet divergences of the 
three elementary loop diagrams of figure 11.2, and in fact they are sufficient 
to eliminate all such divergences in all QED loops. 

Before proceeding further we remark that we already have a first indication
that renormalizing a gauge theory presents some new features. Consider the
two counter terms involving $Z_2 - 1$ and $Z_1 - 1$; their sum gives

$$ \bar{\hat\psi}\,\bigl[\,i(Z_2 - 1)\slashed\partial - e(Z_1 - 1)\slashed{\hat A}\,\bigr]\,\hat\psi \tag{11.8} $$

which is not of the ‘gauge principle’ form ‘$i\slashed\partial - e\slashed{\hat A}$’! Unless, of course, $Z_1 = Z_2$.
This relation between the two quite different renormalization constants
is, in fact, true to all orders in perturbation theory, as a consequence of a
Ward identity (Ward 1950), which is itself a consequence of gauge invariance.
We shall discuss the Ward identity and $Z_1 = Z_2$ at the one loop level in
section 11.6.
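
The consequence of setting $Z_1 = Z_2$ can be written out in one line. This is a sketch using only the counter-term sum of (11.8) and the relation $Z_1 e = e_0 Z_2 Z_3^{1/2}$ of (11.6):

```latex
\bar{\hat\psi}\,\bigl[\,i(Z_2 - 1)\slashed\partial - e(Z_1 - 1)\slashed{\hat A}\,\bigr]\,\hat\psi
\;\xrightarrow{\;Z_1 = Z_2\;}\;
(Z_2 - 1)\,\bar{\hat\psi}\,\bigl(i\slashed\partial - e\slashed{\hat A}\bigr)\,\hat\psi ,
\qquad
Z_1 e = e_0 Z_2 Z_3^{1/2} \;\Longrightarrow\; e = Z_3^{1/2}\,e_0 .
```

The two counter terms then combine into the gauge-principle form, and the relation between the physical and bare charges involves only the photon factor $Z_3$.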


11.2 The $O(e^2)$ fermion self-energy


In analogy with $-i\Pi_C^{[2]}$, the amplitude corresponding to figure 11.2(a) is the
fermion self-energy $-i\Sigma^{[2]}$ where

$$ -i\Sigma^{[2]}(p) = (-ie)^2\int\frac{d^4k}{(2\pi)^4}\,\gamma^\nu\,\frac{-ig_{\mu\nu}}{k^2}\,\frac{i}{\slashed p - \slashed k - m}\,\gamma^\mu \tag{11.9} $$

and we have now chosen the gauge $\xi = 1$. As expected, the $d^4k$ integral
in (11.9) diverges for large $k$, this time more seriously than the integral in
$\Pi_C^{[2]}$, because there are only three powers of $k$ in the denominator of (11.9)
as opposed to four in (10.7). Once again, we need to choose some form of
regularization to make (11.9) ultraviolet finite. We shall not be specific (as
yet) about what choice we are making, since whatever it may be the outcome
will be qualitatively similar to the $\Pi_C^{[2]}$ case.

There is, however, one interesting new feature in this (fermion) case. As
previously indicated, power-counting in the integral of (11.9) might lead us to
expect that, if we adopt a simple cut-off, the leading ultraviolet divergence
of $\Sigma^{[2]}$ would be proportional to $\Lambda$ rather than $\ln\Lambda$. This is because we
have that one extra power of $k$ in the numerator and $\Sigma^{[2]}$ has dimensions
of mass. However, this is not so. The leading $p$-independent divergence is,
in fact, proportional to $m\ln(\Lambda/m)$. The reason for this is important and
it has interesting generalizations. Suppose that $m$ in (11.4) were set equal
to zero. Then, as we saw in problem 9.4, the two helicity components $\hat\psi_{\rm R}$
and $\hat\psi_{\rm L}$ of the electron field will not be coupled by the QED interaction.
It follows that no terms of the form $\bar{\hat\psi}_{\rm L}\hat\psi_{\rm R}$ or $\bar{\hat\psi}_{\rm R}\hat\psi_{\rm L}$ can be generated, and
hence no perturbatively induced mass term, if $m = 0$. The perturbative mass
shift must be proportional to $m$ and therefore, on dimensional grounds, only
logarithmically divergent.

There is also a $p$-dependent divergence of the self-energy, of which warning
was given in section 10.3.2. As in the scalar case, this will be associated with
the field strength renormalization factor $Z_2$. It is proportional to $\slashed{p}\ln(\Lambda/m)$
($Z_2$ is the coefficient of $\slashed{\partial}$ in (11.8), which leads to $\slashed{p}$ in momentum space). The
upshot is that the fermion propagator, including the one-loop renormalized
self-energy, is given by

$$ \frac{i}{\slashed{p} - m - \bar\Sigma^{[2]}(p)} \qquad (11.10) $$

where (cf (10.74))

$$ \bar\Sigma^{[2]}(p) = \Sigma^{[2]}(p) - \Sigma^{[2]}(\slashed{p} = m) - (\slashed{p} - m)\left.\frac{d\Sigma^{[2]}}{d\slashed{p}}\right|_{\slashed{p} = m}. \qquad (11.11) $$

Whatever form of regularization is used, the twice-subtracted $\bar\Sigma^{[2]}$ will be
finite and independent of the regulator when it is removed. In terms of the
‘compensating’ quantities $Z_2$ and $m_0 - m$, we find (problem 11.1, cf (10.70))

$$ Z_2^{[2]} = 1 + \left.\frac{d\Sigma^{[2]}}{d\slashed{p}}\right|_{\slashed{p} = m}, \qquad m_0^{[2]} - m = -Z_2^{-1}\,\Sigma^{[2]}(\slashed{p} = m). \qquad (11.12) $$

Note that, as in the case of $\bar\Pi_C^{[2]}$, the definition (11.11) of $\bar\Sigma^{[2]}$ implies that
propagator corrections vanish for external (on-shell) fermions. The quantities
$Z_2$ and $m_0$ determined by (11.12) now carry a superscript ‘[2]’ to indicate that
they are correct at $O(e^2)$.

We must now remind the reader that, although we have indeed eliminated
the ultraviolet divergences in $\bar\Sigma^{[2]}$ by the subtractions of (11.11), there remains
an untreated infrared divergence in $d\Sigma^{[2]}/d\slashed{p}$. To show how this is dealt with
would take us beyond our intended scope, as explained at the start of the
chapter. Suffice it to say that by the introduction of a ‘regulating’ photon
mass $\mu$, and consideration of relevant real photon processes along with virtual
ones, these infrared problems can be controlled (Weinberg 1995, Peskin and
Schroeder 1995).


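11.3 The $O(e^2)$ photon self-energy


The amplitude corresponding to figure 11.2(b) is $i\Pi^{[2]}_{\mu\nu}(q)$ where

$$ i\Pi^{[2]}_{\mu\nu}(q) = (-1)(-ie)^2\, {\rm Tr}\int \frac{d^4k}{(2\pi)^4}\, \frac{i}{\slashed{q} + \slashed{k} - m}\,\gamma_\mu\, \frac{i}{\slashed{k} - m}\,\gamma_\nu \qquad (11.13) $$

$$ \qquad\qquad = -e^2 \int \frac{d^4k}{(2\pi)^4}\, \frac{{\rm Tr}[(\slashed{q} + \slashed{k} + m)\gamma_\mu(\slashed{k} + m)\gamma_\nu]}{[(q+k)^2 - m^2][k^2 - m^2]}. \qquad (11.14) $$

Once again, this photon self-energy is analogous to the scalar particle
self-energy of chapter 10. There are two new features to be commented on in
(11.14). The first is the overall ‘$-1$’ factor, which occurs whenever there is a
closed fermion loop. The keen reader may like to pursue this via problem 11.2.
The second feature is the appearance of the trace symbol ‘Tr’: this is plausible
as the amplitude is basically a $1\gamma \to 1\gamma$ one with no spinor indices, but again
the reader can follow that through in problem 11.3.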

We now want to go some way into the calculation of $\Pi^{[2]}_{\mu\nu}$ because it will,
in the end, contain important physics, for example corrections to Coulomb’s
law. The first step is to evaluate the numerator trace factor using the theorems
of section 8.2.3. We find (problem 11.4)

$$ {\rm Tr}[(\slashed{q} + \slashed{k} + m)\gamma_\mu(\slashed{k} + m)\gamma_\nu] = 4\{(q_\mu + k_\mu)k_\nu + (q_\nu + k_\nu)k_\mu - g_{\mu\nu}((q\cdot k) + k^2 - m^2)\}. \qquad (11.15) $$


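We then use the Feynman identity (10.40) to combine the denominators, yielding

$$ \frac{1}{[(q+k)^2 - m^2][k^2 - m^2]} = \int_0^1 dx\, \frac{1}{[k'^2 - \Delta_\gamma + i\epsilon]^2} \qquad (11.16) $$

where $k' = k + xq$, $\Delta_\gamma = -x(1-x)q^2 + m^2$ (note that $\Delta_\gamma$ is precisely the same
as $\Delta$ of (10.43) with $m_A = m_B = m$) and we have reinstated the implied ‘$i\epsilon$’.
Making the shift to the variable $k'$ in the numerator factor (11.15) produces
a revised numerator which is

$$ 4\{2k'_\mu k'_\nu - g_{\mu\nu}(k'^2 - \Delta_\gamma) - 2x(1-x)(q_\mu q_\nu - g_{\mu\nu}q^2) + \text{terms linear in } k'\} \qquad (11.17) $$

where the terms linear in $k'$ will vanish by symmetry when integrated over $k'$
in (11.14). Our result so far is therefore

$$ \begin{aligned}
i\Pi^{[2]}_{\mu\nu}(q) = {} & -4e^2 \int_0^1 dx \int \frac{d^4k'}{(2\pi)^4} \left[ \frac{2k'_\mu k'_\nu}{(k'^2 - \Delta_\gamma + i\epsilon)^2} - \frac{g_{\mu\nu}}{(k'^2 - \Delta_\gamma + i\epsilon)} \right] \\
& + 8e^2 (q_\mu q_\nu - g_{\mu\nu}q^2) \int_0^1 dx \int \frac{d^4k'}{(2\pi)^4}\, \frac{x(1-x)}{(k'^2 - \Delta_\gamma + i\epsilon)^2}. \qquad (11.18)
\end{aligned} $$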
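
The shift $k = k' - xq$ that turns (11.15) into (11.17) is easy to get wrong by a sign, so a numerical spot check may be helpful. The following is a minimal sketch, not part of the original derivation: it assumes the metric signature $(+,-,-,-)$, uses arbitrary random test vectors, and all function names are illustrative. It averages the right-hand side of (11.15) over $k' \to -k'$, which removes the terms linear in $k'$, and compares the result with the explicit terms of (11.17).

```python
import random

# Diagonal Minkowski metric g_{mu nu}, signature (+,-,-,-) (assumed convention).
G = [[1 if i == j == 0 else (-1 if i == j else 0) for j in range(4)] for i in range(4)]

def dot(a, b):
    """Minkowski product a.b for contravariant components a^mu, b^mu."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

def lower(a):
    """Lower the index: a_mu = g_{mu nu} a^nu."""
    return [a[0], -a[1], -a[2], -a[3]]

def trace_numerator(q, k, m):
    """Right-hand side of (11.15) as a 4x4 array with lower indices mu, nu."""
    ql, kl = lower(q), lower(k)
    s = dot(q, k) + dot(k, k) - m * m
    return [[4 * ((ql[mu] + kl[mu]) * kl[nu] + (ql[nu] + kl[nu]) * kl[mu] - G[mu][nu] * s)
             for nu in range(4)] for mu in range(4)]

def shifted_even_part(q, kp, m, x):
    """Expression (11.17) without the terms linear in k'."""
    ql, kpl = lower(q), lower(kp)
    q2 = dot(q, q)
    delta = -x * (1 - x) * q2 + m * m
    return [[4 * (2 * kpl[mu] * kpl[nu] - G[mu][nu] * (dot(kp, kp) - delta)
                  - 2 * x * (1 - x) * (ql[mu] * ql[nu] - G[mu][nu] * q2))
             for nu in range(4)] for mu in range(4)]

def max_mismatch(seed=0):
    """Largest component difference between the k'-even part of (11.15) and (11.17)."""
    rng = random.Random(seed)
    q = [rng.uniform(-1, 1) for _ in range(4)]
    kp = [rng.uniform(-1, 1) for _ in range(4)]
    m, x = 0.7, rng.uniform(0, 1)
    k_plus = [kp[i] - x * q[i] for i in range(4)]    # k = k' - x q
    k_minus = [-kp[i] - x * q[i] for i in range(4)]  # same with k' -> -k'
    a = trace_numerator(q, k_plus, m)
    b = trace_numerator(q, k_minus, m)
    c = shifted_even_part(q, kp, m, x)
    return max(abs(0.5 * (a[mu][nu] + b[mu][nu]) - c[mu][nu])
               for mu in range(4) for nu in range(4))
```

Calling `max_mismatch()` for a few seeds should return values at the level of floating-point rounding, which supports the coefficient $-2x(1-x)$ of the transverse structure in (11.17).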


Consider now the ultraviolet divergences of (11.18), adopting a simple
cut-off as a regularization. The terms in the first line are both apparently
quadratically divergent, while the integral in the second line is logarithmically
divergent. What counter terms do we have to cancel these divergences? The
answer is that the ‘$(Z_3 - 1)$’ counter term of figure 11.1(b) is of exactly the right
form to cancel the logarithmic divergence in the second line of (11.18), but
we have no counter term proportional to the $g_{\mu\nu}$ term in the first line. Note,
incidentally, that we can argue from Lorentz covariance (see appendix D) that

$$ \int \frac{d^4k'}{(2\pi)^4}\, \frac{k'_\mu k'_\nu}{(k'^2 - \Delta_\gamma + i\epsilon)^2} = F(\Delta_\gamma)\, g_{\mu\nu} \qquad (11.19) $$

so that taking the dot product of both sides with $g^{\mu\nu}$ we deduce that

$$ \int \frac{d^4k'}{(2\pi)^4}\, \frac{2k'_\mu k'_\nu}{(k'^2 - \Delta_\gamma + i\epsilon)^2} = \frac{1}{2} \int \frac{d^4k'}{(2\pi)^4}\, \frac{k'^2\, g_{\mu\nu}}{(k'^2 - \Delta_\gamma + i\epsilon)^2}. \qquad (11.20) $$

It follows that both the terms in the first line of (11.18) produce a divergence
of the form $\sim\Lambda^2 g_{\mu\nu}$, and they do not cancel, at least in our simple cut-off
regularization.

A term proportional to $g_{\mu\nu}$ is, in fact, a photon mass term. A Lagrangian
mass term for the photon would have the form $\frac{1}{2}m_\gamma^2 g_{\mu\nu}\hat A_0^\mu \hat A_0^\nu$, which after
introducing the rescaled $\hat A^\mu$ will generate a counter term proportional to
$g_{\mu\nu}\hat A^\mu \hat A^\nu$, and an associated Feynman amplitude proportional to $g_{\mu\nu}$. But
such an $m_\gamma^2$ term violates gauge invariance! (It is plainly not invariant under
(7.69).) Evidently the simple momentum cut-off that we have adopted
as a regularization procedure does not respect gauge invariance. We saw in
section 8.6.2 that gauge invariance implied the condition

$$ q^\mu T_\mu = 0 \qquad (11.21) $$

where $q$ is the 4-momentum of a photon entering a one-photon amplitude $T_\mu$.
Our discussion of (11.21) was limited in section 8.6.2 to the case of a real
external photon, whereas the photon lines in $\Pi^{[2]}_{\mu\nu}$ are internal and virtual;
nevertheless it is still true that gauge invariance implies (Peskin and Schroeder
1995, section 7.4)

$$ q^\mu \Pi^{[2]}_{\mu\nu} = q^\nu \Pi^{[2]}_{\mu\nu} = 0. \qquad (11.22) $$


Condition (11.22) is guaranteed by the tensor structure $(q_\mu q_\nu - g_{\mu\nu}q^2)$ of the
second line in (11.18), provided the divergence is regularized. As previously
implied, a simple cut-off $\Lambda$ suffices for this term, since it does not alter the
tensor structure, and the $\Lambda$-dependence can be compensated by the ‘$Z_3 - 1$’
counter term which has the same tensor structure (cf figure 11.2(b)). But
what about the first line of (11.18)? Various gauge-invariant regularizations
have been used, the effect of all of which is to cause the first line of (11.18) to
vanish. The most widely used, since the 1970s, is the dimensional regularization
technique introduced by ’t Hooft and Veltman (1972), which involves the
‘continuation’ of the number of space-time dimensions from four to $d$ ($< 4$).
As $d$ is reduced, the integrals tend to diverge less, and the divergences can be
isolated via the terms which diverge as $d \to 4$. Using gauge-invariant dimensional
regularization, the two terms in the first line of (11.18) are found to
cancel each other exactly, leaving just the manifestly gauge invariant second
line (see appendix O of volume 2).

We proceed to the next step, renormalizing the gauge-invariant part of
$\Pi^{[2]}_{\mu\nu}$.


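11.4 The $O(e^2)$ renormalized photon self-energy


The surviving (gauge-invariant) term of $\Pi^{[2]}_{\mu\nu}$ is

$$ i\Pi^{[2]}_{\mu\nu}(q^2) = 8e^2 (q_\mu q_\nu - q^2 g_{\mu\nu}) \int_0^1 dx \int \frac{d^4k'}{(2\pi)^4}\, \frac{x(1-x)}{(k'^2 - \Delta_\gamma + i\epsilon)^2} \qquad (11.23) $$

$$ \qquad\qquad \equiv i(q^2 g_{\mu\nu} - q_\mu q_\nu)\, \Pi_\gamma^{[2]}(q^2). \qquad (11.24) $$

The $d^4k'$ integral in (11.23) is exactly the same as the one in (10.42), with $\Delta$
replaced by $\Delta_\gamma$. It contains a logarithmic divergence, which we regulate as
before by a cut-off $\Lambda$, so that we are dealing with the gauge-invariant
quantity $\Pi_\gamma^{[2]}(q^2, \Lambda^2)$. The calculation leading to (10.55) then tells us that, as
$\Lambda \to \infty$,

$$ \Pi_\gamma^{[2]}(q^2, \Lambda^2) = -\frac{e^2}{\pi^2} \int_0^1 dx\, x(1-x) \left\{ \ln\Lambda + (\ln 2 - 1) - \tfrac{1}{2}\ln\Delta_\gamma \right\}. \qquad (11.25) $$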


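The analogue of (10.11) is then (in the gauge $\xi = 1$)

$$ \begin{aligned}
& \frac{-ig_{\mu\nu}}{q^2} + \frac{-ig_{\mu\rho}}{q^2}\, i(q^2 g^{\rho\sigma} - q^\rho q^\sigma)\, \Pi_\gamma^{[2]}(q^2, \Lambda^2)\, \frac{-ig_{\sigma\nu}}{q^2} \\
& \quad + \frac{-ig_{\mu\rho}}{q^2}\, i(q^2 g^{\rho\sigma} - q^\rho q^\sigma)\, \Pi_\gamma^{[2]}(q^2, \Lambda^2)\, \frac{-ig_{\sigma\tau}}{q^2}\, i(q^2 g^{\tau\eta} - q^\tau q^\eta)\, \Pi_\gamma^{[2]}(q^2, \Lambda^2)\, \frac{-ig_{\eta\nu}}{q^2} + \cdots \\
& = \frac{-ig_{\mu\nu}}{q^2} + \frac{-ig_{\mu\rho}}{q^2}\, P^\rho{}_\nu\, \Pi_\gamma^{[2]}(q^2, \Lambda^2) + \frac{-ig_{\mu\rho}}{q^2}\, P^\rho{}_\tau P^\tau{}_\nu\, (\Pi_\gamma^{[2]}(q^2, \Lambda^2))^2 + \cdots \qquad (11.26)
\end{aligned} $$

where

$$ P^\rho{}_\nu = g^\rho{}_\nu - \frac{q^\rho q_\nu}{q^2} $$

and

$$ g^\rho{}_\nu = \delta^\rho{}_\nu $$

(i.e. the $4 \times 4$ unit matrix). It is easy to check (problem 11.5) that $P^\rho{}_\tau P^\tau{}_\nu = P^\rho{}_\nu$.
Hence the series (11.26) becomes

$$ \begin{aligned}
& \frac{-ig_{\mu\nu}}{q^2} + \frac{-ig_{\mu\rho}}{q^2}\, P^\rho{}_\nu\, \left( \Pi_\gamma^{[2]}(q^2, \Lambda^2) + (\Pi_\gamma^{[2]}(q^2, \Lambda^2))^2 + \cdots \right) \\
& = \frac{-ig_{\mu\rho}}{q^2}\, (\delta^\rho{}_\nu - P^\rho{}_\nu) + \frac{-ig_{\mu\rho}}{q^2}\, P^\rho{}_\nu\, \left( 1 + \Pi_\gamma^{[2]} + (\Pi_\gamma^{[2]})^2 + \cdots \right) \\
& = \frac{-i(g_{\mu\nu} - q_\mu q_\nu/q^2)}{q^2 (1 - \Pi_\gamma^{[2]})} - \frac{i}{q^2}\, \frac{q_\mu q_\nu}{q^2} \qquad (11.27)
\end{aligned} $$

after summing the geometric series, exactly as in (10.11)-(10.14).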
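
The two algebraic facts used above, the idempotence $P^\rho{}_\tau P^\tau{}_\nu = P^\rho{}_\nu$ and the resummation of the series into $(\delta - P) + P/(1 - \Pi)$, can be checked with exact rational arithmetic. This is only an illustrative sketch: the test momentum $q$, the numerical value standing in for $\Pi_\gamma^{[2]}$, and the function names are all assumptions chosen for the check, and the metric signature $(+,-,-,-)$ is assumed.

```python
from fractions import Fraction

METRIC = (1, -1, -1, -1)  # diagonal of g_{mu nu}, signature (+,-,-,-) (assumed)

def projector(q):
    """P^rho_nu = delta^rho_nu - q^rho q_nu / q^2 for integer contravariant q^rho."""
    q_low = [METRIC[n] * q[n] for n in range(4)]
    q2 = sum(q[n] * q_low[n] for n in range(4))  # must be non-zero
    return [[Fraction(int(r == n)) - Fraction(q[r] * q_low[n], q2)
             for n in range(4)] for r in range(4)]

def matmul(a, b):
    """Contract the lower index of a with the upper index of b."""
    return [[sum(a[r][t] * b[t][n] for t in range(4)) for n in range(4)]
            for r in range(4)]

def dressed_series(q, pi, terms):
    """delta + sum_{n=1}^{terms} (P pi)^n: the tensor multiplying -i g_{mu rho}/q^2 in (11.26)."""
    p = projector(q)
    identity = [[Fraction(int(r == n)) for n in range(4)] for r in range(4)]
    total, power = identity, identity
    for _ in range(terms):
        power = [[pi * entry for entry in row] for row in matmul(power, p)]
        total = [[total[r][n] + power[r][n] for n in range(4)] for r in range(4)]
    return total

def closed_form(q, pi):
    """(delta - P) + P / (1 - pi), the summed form that leads to (11.27)."""
    p = projector(q)
    return [[Fraction(int(r == n)) - p[r][n] + p[r][n] / (1 - pi)
             for n in range(4)] for r in range(4)]
```

For example, with `q = (3, 1, 2, 1)` one finds `matmul(projector(q), projector(q)) == projector(q)` exactly, and `dressed_series(q, Fraction(1, 10), 40)` agrees with `closed_form(q, Fraction(1, 10))` up to the neglected tail of the geometric series.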

But we have forgotten the counter term of figure 11.1(b), which contributes
an amplitude $-i(g^{\mu\nu}q^2 - q^\mu q^\nu)(Z_3 - 1)$. This has the effect of replacing $\Pi_\gamma^{[2]}$
in (11.27) by $\Pi_\gamma^{[2]} - (Z_3 - 1)$ and we arrive at the form

$$ \frac{-i(g_{\mu\nu} - q_\mu q_\nu/q^2)}{q^2 (Z_3 - \Pi_\gamma^{[2]})} - \frac{i}{q^2}\, \frac{q_\mu q_\nu}{q^2}. \qquad (11.28) $$

Now in any S-matrix element, at least one end of this corrected propagator
will connect to an external charged particle line via a vertex of the form
$j^\mu(p, p')$ (cf (8.98) and (8.99) for example), as in figure 11.3. But, as we have
seen in (8.100), current conservation implies

$$ q_\mu j^\mu(p, p') = 0. \qquad (11.29) $$

Hence the parts of (11.28) with $q_\mu q_\nu$ factors will not contribute to physical
scattering amplitudes, and our $O(e^2)$ corrected photon propagator effectively
takes the simple form

$$ \frac{-ig_{\mu\nu}}{q^2 (Z_3 - \Pi_\gamma^{[2]}(q^2, \Lambda^2))}. \qquad (11.30) $$

We must now determine $Z_3$ from the condition (just as for the C propagator)
that (11.30) has the form $-ig_{\mu\nu}/q^2$ as $q^2 \to 0$ (the mass-shell condition). This
gives

$$ Z_3^{[2]} = 1 + \Pi_\gamma^{[2]}(0, \Lambda^2) \qquad (11.31) $$

the superscript on $Z_3$ indicating as usual that it is an $O(e^2)$ calculation as
evidenced by the $e^2$ factor in (11.18). We note from equation (11.25) that
$\Pi_\gamma^{[2]}(0, \Lambda^2)$ contains a $\ln\Lambda$ part, so that this time the field renormalization
constant $Z_3$ diverges when the cut-off is removed.


FIGURE 11.3
One-loop corrected photon propagator connected to a charged particle vertex.


Inserting (11.31) into (11.30) we obtain the final important expression for
the $\gamma$-propagator including the one-loop renormalized self-energy (cf (10.71)):

$$ \frac{-ig_{\mu\nu}}{q^2 (1 - \bar\Pi_\gamma^{[2]}(q^2))} \qquad (11.32) $$

where

$$ \bar\Pi_\gamma^{[2]}(q^2) = \Pi_\gamma^{[2]}(q^2, \Lambda^2) - \Pi_\gamma^{[2]}(0, \Lambda^2). \qquad (11.33) $$

Equation (11.25) then leads to the result

$$ \bar\Pi_\gamma^{[2]}(q^2) = -\frac{2\alpha}{\pi} \int_0^1 dx\, x(1-x) \ln\left[ \frac{m^2}{m^2 - q^2 x(1-x)} \right] \qquad (11.34) $$

which was first given by Schwinger (1949a). This ‘once-subtracted’ $\bar\Pi_\gamma^{[2]}$ is
finite as $\Lambda \to \infty$, and tends to zero as $q^2 \to 0$.

The generalization of (11.32) to all orders will be given by

$$ \frac{-ig_{\mu\nu}}{q^2 (1 - \bar\Pi_\gamma(q^2))} \qquad (11.35) $$

where $\bar\Pi_\gamma(q^2)$ is the all-orders analogue of $\bar\Pi_\gamma^{[2]}$ in (11.32), and is similarly
related to the 1-$\gamma$ irreducible photon self-energy $\Pi_{\mu\nu}$ via the analogue of (11.24):

$$ i\Pi_{\mu\nu}(q^2) = i(q^2 g_{\mu\nu} - q_\mu q_\nu)\, \Pi_\gamma(q^2). \qquad (11.36) $$

Because $\Pi_{\mu\nu}$, and hence $\Pi_\gamma$, has no 1-$\gamma$ intermediate states, it is expected to
have no contribution of the form $A^2/q^2$. If such a contribution were present,
(11.35) shows that it would result in a photon propagator having the form

$$ \frac{-ig_{\mu\nu}}{q^2 - A^2} \qquad (11.37) $$

which is, of course, that of a massive particle. Thus, provided no such
contribution is present, the photon mass will remain zero through all radiative
corrections. It is important to note, though, that gauge invariance is fully
satisfied by the general form (11.36) relating $\Pi_{\mu\nu}$ to $\Pi_\gamma$; it does not prevent the
occurrence of such an ‘$A^2/q^2$’ piece in $\Pi_\gamma$. Remarkably, therefore, it seems
possible, after all, to have a massive photon while respecting gauge invariance!
This loophole in the argument ‘gauge invariance implies $m_\gamma = 0$’ was
first pointed out by Schwinger (1962).


FIGURE 11.4
The contribution of a massless particle to the photon self-energy.

Such a $1/q^2$ contribution in $\Pi_\gamma$ must, of course, correspond to a massless
single particle intermediate state, via a diagram of the form shown in
figure 11.4. Thus if the theory contains a massless particle, not the photon
(since 1-$\gamma$ states are omitted from $\Pi_\gamma$) but coupling to it, the photon can
acquire mass. This is one way of understanding the ‘Higgs mechanism’ for
generating a mass for a gauge-field quantum while still respecting the gauge
symmetry (Englert and Brout 1964, Higgs 1964, Guralnik et al. 1964). The
massless particle involved is called a ‘Goldstone boson’. As we shall see in
volume 2, just such a photon mass is generated in a superconductor, and a
similar mechanism is invoked in the Standard Model to give masses to the
$W^\pm$ and $Z^0$ gauge bosons, which mediate the weak interactions.


11.5 The physics of $\bar\Pi_\gamma^{[2]}(q^2)$


We now consider some immediate physical consequences of the formulae (11.32)
and (11.34).


11.5.1 Modified Coulomb’s law 


In section 1.3.3 we saw how, in the static limit, a propagator of the form
$-g_N^2(\mathbf{q}^2 + m_U^2)^{-1}$ could be interpreted (via a Fourier transform) in terms of a
Yukawa potential

$$ \frac{-g_N^2}{4\pi}\, \frac{e^{-r/a}}{r} $$

where $a = m_U^{-1}$ (in units $\hbar = c = 1$). As $m_U \to 0$ we arrive at the Coulomb
potential, associated with the propagator $\sim 1/\mathbf{q}^2$ in the static ($q_0 = 0$) limit.
It follows that the corrected propagator (11.32) must represent a correction
to the $1/r$ Coulomb potential.

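To see what it is, we expand the denominator of (11.32) so as to write
(11.32) as

$$ \frac{-ig_{\mu\nu}}{q^2} \left( 1 + \bar\Pi_\gamma^{[2]}(q^2) \right) \qquad (11.38) $$

which is in fact the perturbative $O(\alpha)$ correction to the propagator (we shall
return to (11.32) in a moment). At low energies, and in the static limit,
$q^2 = -\mathbf{q}^2$ will be small compared to the fermion (mass)$^2$ in (11.34), and we
may expand the logarithm in powers of $\mathbf{q}^2/m^2$, with the result that the static
propagator becomes (problem 11.6)

$$ \frac{ig_{\mu\nu}}{\mathbf{q}^2} \left( 1 + \frac{\alpha}{15\pi}\, \frac{\mathbf{q}^2}{m^2} \right) \qquad (11.39) $$

$$ \qquad = ig_{\mu\nu} \left( \frac{1}{\mathbf{q}^2} + \frac{\alpha}{15\pi}\, \frac{1}{m^2} \right). \qquad (11.40) $$

The Fourier transform of the first term in (11.40) is proportional to the familiar
coulombic $1/r$ potential (see appendix G, for example), while the Fourier
transform of the constant ($\mathbf{q}^2$-independent) second term is a $\delta$-function:

$$ \int e^{i\mathbf{q}\cdot\mathbf{r}}\, \frac{d^3\mathbf{q}}{(2\pi)^3} = \delta^3(\mathbf{r}). \qquad (11.41) $$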


When (11.40) is used in any scattering process between two charged particles,
each charged particle vertex will carry a charge $e$ (or $-e$) and so the total
effective potential will be (in the attractive case)

$$ -\left\{ \frac{\alpha}{r} + \frac{4\alpha^2}{15m^2}\, \delta^3(\mathbf{r}) \right\}. \qquad (11.42) $$

The second term in (11.42) may be treated as a perturbation in hydrogenic
atoms, taking $m$ to be the electron mass. Application of first-order perturbation
theory yields an energy shift

$$ \Delta E_n^{(1)} = -\frac{4\alpha^2}{15m^2} \int \psi_n^*(\mathbf{r})\, \delta^3(\mathbf{r})\, \psi_n(\mathbf{r})\, d^3\mathbf{r} = -\frac{4\alpha^2}{15m^2}\, |\psi_n(0)|^2. \qquad (11.43) $$

Only s-state wavefunctions are non-vanishing at the origin, where they take
the value (in hydrogen)

$$ \psi_n(0) = \frac{1}{\sqrt{\pi}} \left( \frac{\alpha m}{n} \right)^{3/2} \qquad (11.44) $$

where $n$ is the principal quantum number. Hence for this case

$$ \Delta E_n^{(1)} = -\frac{4\alpha^5 m}{15\pi n^3}. \qquad (11.45) $$


For example, in the 2s state the energy shift is $-1.122 \times 10^{-7}$ eV. Although
we did not discuss the Coulomb spectrum predicted by the Dirac equation
in chapter 3, it turns out that the $2{\rm S}_{1/2}$ and $2{\rm P}_{1/2}$ levels are degenerate if
no radiative corrections (such as the previous one) are applied. In fact, the
levels are found experimentally to be split apart by the famous ‘Lamb shift’,
which amounts to $\Delta E/2\pi\hbar = 1058$ MHz in frequency units. The shift we have
calculated, for the 2s level, is $-27.13$ MHz in these units, so it is a small (but
still perfectly measurable) contribution to the entire shift. This particular
contribution was first calculated by Uehling (1935).
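
The numbers quoted for the 2s level follow directly from (11.45) and can be reproduced in a few lines. The sketch below is illustrative only; the numerical values of $\alpha$, the electron mass and Planck's constant are standard inputs supplied here as assumptions, not taken from the text, and the function names are hypothetical.

```python
import math

ALPHA = 1 / 137.035999       # fine structure constant (assumed input)
M_E_EV = 0.51099895e6        # electron mass in eV (assumed input)
H_EV_S = 4.135667696e-15     # Planck constant 2*pi*hbar in eV s (assumed input)

def uehling_shift_ev(n, alpha=ALPHA, m_ev=M_E_EV):
    """Energy shift (11.45) of an s-state with principal quantum number n, in eV."""
    return -4 * alpha**5 * m_ev / (15 * math.pi * n**3)

def ev_to_mhz(energy_ev):
    """Convert an energy to frequency units, Delta E / (2 pi hbar), in MHz."""
    return energy_ev / H_EV_S / 1e6

shift_2s = uehling_shift_ev(2)
print(f"2s shift: {shift_2s:.4e} eV = {ev_to_mhz(shift_2s):.2f} MHz")
```

With these inputs the 2s shift comes out close to $-1.122 \times 10^{-7}$ eV, or about $-27.13$ MHz, in agreement with the values quoted above.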

While small in hydrogen and ordinary atoms, the ‘Uehling effect’ dominates
the radiative corrections in muonic atoms, where the ‘$m$’ in (11.44)
becomes the muon mass $m_\mu$. This means that the result (11.45) becomes

$$ -\frac{4\alpha^5}{15\pi n^3} \left( \frac{m_\mu}{m} \right)^2 m_\mu. $$

Since the unperturbed energy levels are (in this case) proportional to $m_\mu$,
this represents a relative enhancement of $\sim (m_\mu/m)^2 \sim (210)^2$. This calculation
cannot be trusted in detail, however, as the muonic atom radius is
itself $\sim 1/210$ times smaller than the electron radius in hydrogen, so that the
approximation $|\mathbf{q}| \sim 1/r \ll m$, which led to (11.42), is no longer accurate
enough. Nevertheless the order of magnitude is correct.


11.5.2  Radiatively induced charge form factor 


This leads us to consider (11.38) more generally, without making the low $q^2$
expansion. In chapter 8 we learned how the static Coulomb potential became
modified by a form factor $F(\mathbf{q}^2)$ if the scattering centre was not point-like,
and we also saw how the idea could be extended to covariant form factors
for spin-0 and spin-$\frac{1}{2}$ particles. Referring to the case of $e^-\mu^-$ scattering for
definiteness (section 8.7), we may consider the effect of inserting (11.38) into
(8.182). The result is

$$ e^2\, \bar u_{k'}\gamma_\mu u_k\, \frac{g^{\mu\nu}}{q^2} \left( 1 + \bar\Pi_\gamma^{[2]}(q^2) \right) \bar u_{p'}\gamma_\nu u_p. \qquad (11.46) $$

Referring now to the discussion of form factors for charged spin-$\frac{1}{2}$ particles in
section 8.8, we can share the correction (11.46) equally between the $e^-$ and
the $\mu^-$ vertices and write

$$ e\bar u_{k'}\gamma_\mu u_k \to e\bar u_{k'}\gamma_\mu u_k \left( 1 + \bar\Pi_\gamma^{[2]}(q^2) \right)^{1/2} \approx e\bar u_{k'}\gamma_\mu u_k \left( 1 + \tfrac{1}{2}\bar\Pi_\gamma^{[2]}(q^2) \right) \qquad (11.47) $$

for the electron, and similarly for the muon. From (8.208) this means that our
‘radiative correction’ has generated some effective extension of the charge, as
given by a charge form factor $F_1(q^2) = 1 + \frac{1}{2}\bar\Pi_\gamma^{[2]}(q^2)$. Note that the condition
$F_1(0) = 1$ is satisfied since $\bar\Pi_\gamma^{[2]}(0) = 0$.

In the static case, or for scattering of equal mass particles in the CM
system, we have $q^2 = -\mathbf{q}^2$ and we may consider the Fourier transform of
the function $F_1(-\mathbf{q}^2)$, to obtain the charge distribution. The integral is
discussed in Weinberg (1995, section 10.2) and in Peskin and Schroeder (1995,
section 7.5). The latter authors show that the approximate radial distribution
of charge is $\sim e^{-2mr}/(mr)^{3/2}$, indicating that it has a range $\sim 1/(2m)$. The
mass $2m$ is precisely that of the fermion-antifermion intermediate state in the loop
which yields $\bar\Pi_\gamma^{[2]}$, so this result represents a plausible qualitative extension of
Yukawa’s relationship (1.20) to the case of two-particle exchange. In any case,
the range represented by $\bar\Pi_\gamma^{[2]}$ is of order of the fermion Compton wavelength
$1/m$, which is an important insight; this is why we need to do better than the
point-like approximation (11.42) in the case of muonic atoms.


11.5.3 The running coupling constant 


There is yet another way of interpreting (11.38). Referring to (11.46), we may
regard

$$ e^2(q^2) = e^2[1 + \bar\Pi_\gamma^{[2]}(q^2)] \qquad (11.48) $$

as a ‘$q^2$-dependent effective charge’. In fact, it is usually written as a
‘$q^2$-dependent fine structure constant’

$$ \alpha(q^2) = \alpha[1 + \bar\Pi_\gamma^{[2]}(q^2)]. \qquad (11.49) $$

The concept of a $q^2$-dependent charge may be startling but the related one of
a spatially dependent charge is, in fact, familiar from the theory of dielectrics.
Consider a test charge $q$ in a polarizable dielectric medium, such as water.
If we introduce another test charge $-q$ into the medium, the electric field
between the two test charges will line up the water molecules (which have a
permanent electric dipole moment) as shown in figure 11.5. There will be an
induced dipole moment $\mathbf{P}$ per unit volume, and the effect of $\mathbf{P}$ on the resultant
field is (from elementary electrostatics) the same as that produced by a volume
charge equal to $-{\rm div}\,\mathbf{P}$. If, as is usual, $\mathbf{P}$ is taken to be proportional to $\mathbf{E}$,
so that $\mathbf{P} = \chi\epsilon_0\mathbf{E}$, Gauss’ law will be modified from

$$ {\rm div}\,\mathbf{E} = \rho_{\rm free}/\epsilon_0 \qquad (11.50) $$

to

$$ {\rm div}\,\mathbf{E} = (\rho_{\rm free} - {\rm div}\,\mathbf{P})/\epsilon_0 = \rho_{\rm free}/\epsilon_0 - {\rm div}(\chi\mathbf{E}) \qquad (11.51) $$

where $\rho_{\rm free}$ refers to the test charges introduced into the dielectric. If $\chi$ is
slowly varying as compared to $\mathbf{E}$, it may be taken as approximately constant
in (11.51), which may then be written as

$$ {\rm div}\,\mathbf{E} = \rho_{\rm free}/\epsilon \qquad (11.52) $$

where $\epsilon = (1 + \chi)\epsilon_0$ is the dielectric constant of the medium, $\epsilon_0$ being that of
the vacuum. Thus the field is effectively reduced by the factor $(1 + \chi)^{-1} = \epsilon_0/\epsilon$.


FIGURE 11.5
Screening of charge in a dipolar medium (from Aitchison 1985).

This is all familiar ground. Note, however, that this treatment is essentially
macroscopic, the molecules being replaced by a continuous distribution of
charge density $-{\rm div}\,\mathbf{P}$. When the distance between the two test charges
is as small as, roughly, the molecular diameter, this reduction (or screening
effect) must cease and the field between them has the full unscreened value.
In general, the electrostatic potential between two test charges $q_1$ and $q_2$ in a
dielectric can be represented phenomenologically by

$$ V(r) = q_1 q_2 / 4\pi\epsilon(r)\, r \qquad (11.53) $$

where $\epsilon(r)$ is assumed to vary slowly from the value $\epsilon$ for $r \gg d$ to the value $\epsilon_0$
for $r \ll d$, where $d$ is the diameter of the polarized molecules. The situation
may be described in terms of an effective charge

$$ q' = q/[\epsilon(r)]^{1/2} \qquad (11.54) $$

for each of the test charges. Thus we have an effective charge which depends
on the interparticle separation, as shown in figure 11.6.

Now consider the application of this idea to QED, replacing the polarizable
medium by the vacuum. The important idea is that, in the vicinity of a test
charge in vacuo, charged pairs can be created. Pairs of particles of mass $m$
can exist for a time of the order of $\Delta t \sim \hbar/mc^2$. They can spread apart
a distance of order $c\Delta t$ in this time, i.e. a distance of approximately $\hbar/mc$,
which is the Compton wavelength $\lambda_{\rm c}$. This distance gives a measure of the
‘molecular diameter’ we are talking about, since it is the polarized virtual
pairs which now provide a vacuum screening effect around the original charged
particle. The largest ‘diameter’ will be associated with the smallest mass $m$,
in this case the electron mass. Not coincidentally, this estimate of the range
of the ‘spreading’ of the charge ‘cloud’ is just what we found in section 11.5.2:
namely, the fermion Compton wavelength. The longest-range part of the cloud
will be that associated with the lightest charged fermion, the electron.


FIGURE 11.6
Effective (screened) charge versus separation between charges (from Aitchison
1985).


In this analogy the bare vacuum (no virtual pairs) corresponds to the
‘vacuum’ used in the previous macroscopic analysis and the physical vacuum
(virtual pairs) to the polarizable dielectric. We cannot, of course, get outside
the physical vacuum, so that we are really always dealing with effective charges
that depend on $r$. What, then, do we mean by the familiar symbol $e$? This
is simply the effective charge as $r \to \infty$ or $q^2 \to 0$; or, in practice, the charge
relevant for distances much larger than the particles’ Compton wavelength.
This is how our $q^2 \to 0$ definition is to be understood.

Let us consider, then, how $\alpha(q^2)$ varies when $q^2$ moves to large space-like
values, such that $-q^2$ is much greater than $m^2$ (i.e. to distances well within
the ‘cloud’). For $|q^2| \gg m^2$ we find (problem 11.7) from (11.34) that

$$ \bar\Pi_\gamma^{[2]}(q^2) = \frac{\alpha}{3\pi} \left[ \ln\left( \frac{|q^2|}{m^2} \right) - \frac{5}{3} + O(m^2/|q^2|) \right] \qquad (11.55) $$

so that our $q^2$-dependent fine structure constant, to leading order in $\alpha$, is

$$ \alpha(q^2) \approx \alpha \left[ 1 + \frac{\alpha}{3\pi} \ln\left( \frac{|q^2|}{Am^2} \right) \right] \qquad (11.56) $$

for large values of $|q^2|/m^2$, where $A = \exp(5/3)$.

Equation (11.56) shows that the effective strength $\alpha(q^2)$ tends to increase
at large $|q^2|$ (short distances). This is, after all, physically reasonable: the
reduction in the effective charge caused by the dielectric constant associated
with the polarization of the vacuum disappears (the charge increases) as we
pass inside some typical dipole length. In the present case, that length is $m^{-1}$
(in our standard units $\hbar = c = 1$), the fermion Compton wavelength, a typical
distance over which the fluctuating pairs extend.
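
Both limits of (11.34) used in this section, the low-$|q^2|$ form behind (11.39) and the high-$|q^2|$ form (11.55), can be checked by evaluating the Feynman-parameter integral numerically. The following is a rough sketch using a simple midpoint rule for space-like $q^2$; the function name, the step count and the value of $\alpha$ are assumptions made for illustration.

```python
import math

ALPHA = 1 / 137.035999  # fine structure constant (assumed input)

def pi_bar(q2_over_m2, steps=200000):
    """Midpoint-rule evaluation of (11.34) for space-like q^2 <= 0, with q^2 in units of m^2."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        w = x * (1.0 - x)
        total += w * math.log(1.0 / (1.0 - q2_over_m2 * w))
    return -(2.0 * ALPHA / math.pi) * total * h

# Low |q^2|: compare with (alpha / 15 pi) * |q|^2 / m^2, the correction term in (11.39).
low = pi_bar(-0.01) / (ALPHA / (15 * math.pi) * 0.01)

# High |q^2|: compare with the leading terms of (11.55).
high = pi_bar(-1.0e6) / (ALPHA / (3 * math.pi) * (math.log(1.0e6) - 5.0 / 3.0))

print(low, high)  # both ratios should be close to 1
```

The first ratio differs from unity only by terms of relative order $\mathbf{q}^2/m^2$, and the second by terms of order $m^2/|q^2|$, consistent with the expansions quoted in the text.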

The foregoing is the reason why this whole phenomenon is called vacuum
polarization, and why the original diagram which gave $\Pi_\gamma^{[2]}$ is called a vacuum
polarization diagram.

Equation (11.56) is the lowest-order correction to $\alpha$, in a form valid for
$|q^2| \gg m^2$. It turns out that, in this limit, the dominant vacuum polarization
contributions (for a theory with one charged fermion) can be isolated in each
order of perturbation theory and summed explicitly. The result of summing
these ‘leading logarithms’ is

$$ \alpha(Q^2) = \frac{\alpha}{1 - (\alpha/3\pi)\ln(Q^2/Am^2)} \qquad (11.57) $$

where we now introduce $Q^2 = -q^2$, a positive quantity when $q$ is a momentum
transfer. The justification for (11.57), which of course amounts to the
very plausible return to (11.32) instead of (11.38), is subtle, and depends
upon ideas grouped under the heading of the ‘renormalization group’. This
is beyond the scope of the present volume, but will be taken up again in
volume 2.

Equation (11.57) presents some interesting features. First, note that for
typical large $Q^2 \sim (50\ {\rm GeV})^2$, say, the change in the effective $\alpha$ predicted by
(11.57) is quite measurable. Let us write

$$ \alpha(Q^2) = \frac{\alpha}{1 - \Delta\alpha(Q^2)} \qquad (11.58) $$

in general, where $\Delta\alpha(Q^2)$ includes the contributions from all charged fermions
with mass $m$ such that $m^2 \ll Q^2$. The contribution from the charged leptons
is then straightforward, being given by

$$ \Delta\alpha_{\rm leptons} = \frac{\alpha}{3\pi} \sum_l \ln(Q^2/Am_l^2) \qquad (11.59) $$

where $m_l$ is the lepton mass. Including the e, $\mu$ and $\tau$ one finds (problem 11.8)

$$ \Delta\alpha_{\rm leptons}(Q^2 = (50\ {\rm GeV})^2) \approx 0.03. \qquad (11.60) $$

However, the corresponding quark loop contributions are subject to strong
interaction corrections, and are not straightforward to calculate. We shall not
pursue this in detail here, noting just that the total contribution from the five
quarks u, d, s, c and b has a value very similar to (11.60) for the leptons (see,
for example, Altarelli et al. 1989). Including both the leptonic and hadronic
contributions then yields the estimate

$$ \alpha(Q^2 = (50\ {\rm GeV})^2) \approx \frac{1}{137} \times \frac{1}{0.94} \approx \frac{1}{129}. \qquad (11.61) $$
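
The leptonic estimate (11.60) is a short calculation from (11.59). The sketch below evaluates it at $Q = 50$ GeV; the lepton masses and the value of $\alpha$ are standard inputs supplied here as assumptions. For the final step it takes the hadronic contribution to be equal to the leptonic one, which is only the rough statement made in the text, not a calculation. The function names are illustrative.

```python
import math

ALPHA = 1 / 137.035999                    # fine structure constant (assumed input)
A = math.exp(5.0 / 3.0)                   # the constant A of (11.56)
LEPTON_MASSES_GEV = {"e": 0.000511, "mu": 0.10566, "tau": 1.777}  # assumed inputs

def delta_alpha_leptons(q_gev):
    """Leptonic contribution (11.59) at momentum transfer Q = q_gev."""
    return ALPHA / (3 * math.pi) * sum(
        math.log(q_gev**2 / (A * m**2)) for m in LEPTON_MASSES_GEV.values())

def alpha_effective(q_gev):
    """alpha(Q^2) from (11.58), crudely assuming hadronic = leptonic contribution."""
    delta = 2.0 * delta_alpha_leptons(q_gev)  # rough assumption, see lead-in
    return ALPHA / (1.0 - delta)

print(round(delta_alpha_leptons(50.0), 4))    # about 0.029, i.e. 0.03 as in (11.60)
print(round(1.0 / alpha_effective(50.0), 1))  # about 129, as in (11.61)
```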


The predicted increase of $\alpha(Q^2)$ at large $Q^2$ has been tested by measuring
the differential cross section for Bhabha scattering,

$$ e^+ e^- \to e^+ e^-. \qquad (11.62) $$

We are interested in the contribution from one-photon exchange in the
t-channel, which will contain the factor $\alpha(Q^2)$. To favour this contribution,
the CM energy should be well beyond the $Z^0$ peak in the s-channel (cf figure
9.16). This was the case at the highest LEP energy, $\sqrt{s} = 198$ GeV, which also
allowed large $Q^2$ values to be probed. The L3 experiment covered the region
$1800\ {\rm GeV}^2 < Q^2 < 21600\ {\rm GeV}^2$ (Achard et al. 2005). These results, and
earlier data from L3 (Acciari et al. 2000) and OPAL (Abbiendi et al. 2000),
clearly show the expected rise in $\alpha(Q^2)$ as $Q^2$ increases, and are in good
quantitative agreement with the theoretical prediction of QED (Burkhardt
and Pietrzyk 2001).

The notion of a $q^2$-dependent coupling constant is, in fact, quite general;
for example, we could just as well interpret (10.71) in terms of a $q^2$-dependent
$g_{\rm ph}(q^2)$. Such ‘varying constants’ are called running coupling constants. Until
1973 it was generally believed that they would all behave in essentially the
same way as (11.57), namely with a logarithmic rise as $Q^2$ increases. Many people
(in particular Landau 1955) noted that if equation (11.57) is taken at face value
for arbitrarily large $Q^2$, then $\alpha(Q^2)$ itself will diverge at $Q^2 = Am^2\exp(3\pi/\alpha)$.
Taking $m$ to be the mass of an electron, this is of course an absurdly high
energy. Besides, as such energies are reached, approximations made in arriving
at (11.57) will break down; all we can really say is that perturbation theory
will fail as we approach such energies.
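
To see just how ‘absurdly high’ this energy is, one can take the logarithm of $Q^2 = Am^2\exp(3\pi/\alpha)$ rather than the number itself, which overflows ordinary floating point. The sketch below is illustrative; the electron mass and $\alpha$ are assumed standard inputs, and only the one-fermion formula (11.57) is used.

```python
import math

def log10_landau_scale_gev(alpha=1 / 137.035999, m_gev=0.000511):
    """log10 of Q (in GeV) at which the denominator of (11.57) vanishes.

    From Q^2 = A m^2 exp(3 pi / alpha) with A = exp(5/3):
    ln Q = ln m + 5/6 + 3 pi / (2 alpha).
    """
    ln_q = math.log(m_gev) + 5.0 / 6.0 + 3.0 * math.pi / (2.0 * alpha)
    return ln_q / math.log(10.0)

print(round(log10_landau_scale_gev(), 1))  # roughly 277.5
```

With the electron as the only charged fermion, the pole therefore sits near $Q \sim 10^{277}$ GeV, far beyond any scale at which (11.57) could be trusted.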

While this may be an academic point in QED, it turns out that there is one 
part of the Standard Model where it may be relevant. This is the ‘Higgs sector’ 
involving a complex scalar field, as will be discussed in volume 2. In this case, 
the ‘running’ of the Higgs coupling constant can be invoked to suggest a useful 
upper bound on the Higgs mass (Maiani 1991). 

FIGURE 11.7
Vacuum polarization insertion in the virtual one-photon annihilation amplitude
in $e^+e^- \to \mu^+\mu^-$.


The significance of the 1973 date is that it was in that year that one
of the most important discoveries in ‘post-QED’ quantum field theory was
made, by Politzer (1973) and by Gross and Wilczek (1973). They performed
a similar one-loop calculation in the more complicated case of QCD, which is
a ‘non-Abelian gauge theory’ (as is the theory of the weak interactions in the
electroweak theory). They found that the QCD analogue of (11.57) was

$$ \alpha_{\rm s}(Q^2) = \frac{\alpha_{\rm s}(\mu^2)}{1 + \dfrac{\alpha_{\rm s}(\mu^2)}{12\pi}(33 - 2f)\ln(Q^2/\mu^2)} \qquad (11.63) $$

where $f$ is the number of fermion-antifermion loops considered, and $\mu$ is a
reference mass scale. The crucial difference from (11.57) is the large positive
contribution ‘+33’, which is related to the contributions from the gluonic
self-interactions (non-existent among photons). The quantity $\alpha_{\rm s}(Q^2)$ now tends
to decrease at large $Q^2$ (provided $f \le 16$), tending ultimately to zero. This
property is called ‘asymptotic freedom’ and is highly relevant to understanding
the success of the parton model of chapter 9, in which the quarks and
gluons are taken to be essentially free at large values of $Q^2$. This can be
qualitatively understood in terms of $\alpha_{\rm s}(Q^2) \to 0$ for high momentum transfers
(‘deep scattering’). The non-Abelian parts of the Standard Model will be
considered in volume 2, where we shall return again to $\alpha_{\rm s}(Q^2)$.
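
The role of the factor $(33 - 2f)$ in (11.63) can be illustrated numerically. In the sketch below the reference value $\alpha_{\rm s}(\mu^2) = 0.118$ at $\mu = 91.2$ GeV is an illustrative input that is not taken from the text, and the function name is hypothetical; only the one-loop formula (11.63) is used.

```python
import math

def alpha_s(q2, alpha_s_mu2, mu2, f):
    """One-loop running coupling (11.63) for f fermion-antifermion loops."""
    return alpha_s_mu2 / (
        1.0 + alpha_s_mu2 / (12.0 * math.pi) * (33 - 2 * f) * math.log(q2 / mu2))

MU2 = 91.2**2        # reference scale squared in GeV^2 (illustrative assumption)
ALPHA_S_REF = 0.118  # alpha_s at the reference scale (illustrative assumption)

for f in (5, 16, 17):
    value = alpha_s(1000.0**2, ALPHA_S_REF, MU2, f)
    trend = "decreases" if value < ALPHA_S_REF else "increases"
    print(f"f = {f:2d}: alpha_s(1 TeV^2) = {value:.4f} ({trend} from {ALPHA_S_REF})")
```

For $f \le 16$ the coupling falls as $Q^2$ grows (asymptotic freedom), while for $f = 17$ the sign of $(33 - 2f)$ flips and the coupling would rise, as in QED.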


11.5.4 $\bar\Pi_\gamma^{[2]}$ in the s-channel


We have still not exhausted the riches of $\bar\Pi_\gamma^{[2]}(q^2)$. Hitherto we have
concentrated on regarding our corrected propagator as appearing in a t-channel
exchange process, where $q^2 < 0$. But of course it could also perfectly well
enter an s-channel process such as $e^+e^- \to \mu^+\mu^-$ (see problem 8.18), as
in figure 11.7. In this case, the 4-momentum carried by the photon is
$q = p_{e^+} + p_{e^-} = p_{\mu^+} + p_{\mu^-}$, so that $q^2$ is precisely the usual invariant variable
‘$s$’ (cf section 6.3.3), which in turn is the square of the CM energy and is
therefore positive. In fact, the process of figure 11.7 occurs physically only for
$q^2 = s > 4m_\mu^2$, where $m_\mu$ is the muon mass.

Consider, therefore, our formula (11.34) for $q^2 > 0$, that is, in the time-like
rather than the space-like ($q^2 < 0$) region. The crucial new point is that the
argument $[m^2 - q^2x(1 - x)]$ of the logarithm can now become negative, so
that $\bar\Pi_\gamma^{[2]}$ must develop an imaginary part. The smallest $q^2$ for which this can
happen will correspond to the largest possible value of the product $x(1 - x)$,
for $0 < x < 1$. This value is $\frac{1}{4}$, and so $\bar\Pi_\gamma^{[2]}$ acquires an imaginary part for $q^2 > 4m^2$,
which is the threshold for real creation of an $e^+e^-$ pair.
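
The threshold argument can be made concrete. For $q^2 = s > 4m^2$ the argument of the logarithm in (11.34) is negative for $x$ between $\frac{1}{2}(1 \mp \beta)$ with $\beta = \sqrt{1 - 4m^2/s}$, and each such $x$ contributes $\pi$ to the magnitude of the imaginary part of the logarithm. The sketch below integrates $x(1-x)$ over that interval and compares with the closed form $(\alpha/3)\beta(1 + 2m^2/s)$, a standard result that is not quoted in the text. Only the magnitude is checked, since the overall sign depends on the $i\epsilon$ prescription; the function names are illustrative.

```python
import math

ALPHA = 1 / 137.035999  # fine structure constant (assumed input)

def negative_log_argument_range(s_over_m2):
    """Interval of x where m^2 - s x(1-x) < 0, or None at or below threshold."""
    if s_over_m2 <= 4.0:
        return None
    beta = math.sqrt(1.0 - 4.0 / s_over_m2)
    return 0.5 * (1.0 - beta), 0.5 * (1.0 + beta)

def abs_im_pi_bar(s_over_m2, steps=10000):
    """|Im Pi-bar| from (11.34): (2 alpha / pi) * pi * integral of x(1-x) over the interval."""
    span = negative_log_argument_range(s_over_m2)
    if span is None:
        return 0.0
    lo, hi = span
    h = (hi - lo) / steps
    integral = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h
        integral += x * (1.0 - x)
    return 2.0 * ALPHA * integral * h

def closed_form(s_over_m2):
    """(alpha / 3) * beta * (1 + 2 m^2 / s), valid above threshold."""
    beta = math.sqrt(1.0 - 4.0 / s_over_m2)
    return (ALPHA / 3.0) * beta * (1.0 + 2.0 / s_over_m2)
```

Below $s = 4m^2$ the function `abs_im_pi_bar` returns zero, and above it the numerical integral should agree with `closed_form` to the accuracy of the midpoint rule.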


This is the first time that we have encountered an imaginary part in a
Feynman amplitude which, for figure 11.7 and omitting all the spinor factors,
is once again

$$ \frac{1}{q^2 (1 - \bar\Pi_\gamma^{[2]}(q^2))} \qquad (11.64) $$

but now $q^2 > 4m_\mu^2$, which is greater than $4m^2$ so that $\bar\Pi_\gamma^{[2]}(q^2)$ in (11.64) has
an imaginary part. There is a good physical reason for this, which has to do
with unitarity. This was introduced in section 6.2.2 in terms of the relation
$SS^\dagger = I$ for the S-matrix. The invariant amplitude $\mathcal{M}$ is related to $S$ by
$S_{fi} = 1 + i(2\pi)^4\delta^4(p_i - p_f)\mathcal{M}_{fi}$ (cf (6.102)). Inserting this into $SS^\dagger = I$ leads
to an equation of the form (for help see Peskin and Schroeder (1995, section
7.3))

$$ 2\,{\rm Im}\,\mathcal{M}_{fi} = \sum\!\!\!\!\!\!\int_k\, \mathcal{M}^*_{kf}\, \mathcal{M}_{ki}\, (2\pi)^4 \delta^4(p_i - q_1 - q_2 - \cdots) \qquad (11.65) $$

where ‘$\sum\!\!\!\!\int_k$’ stands for the phase space integral involving momenta $q_1, q_2, \ldots$
over the states allowed by energy-momentum conservation. This implies that
as the energy crosses each threshold for production of a newly allowed state,
there will be a new contribution to the imaginary part of $\mathcal{M}$. This is exactly
what we are seeing here, at the $e^+e^-$ threshold.

It is interesting, incidentally, that (11.65) can be used to derive the
relativistic generalization of the optical theorem given in appendix H (note that
the right-hand side of (11.65) is clearly related to the total cross section for
$i \to k$, if $i = f$).

As regards the real part of $\bar\Pi^{[2]}_\gamma(q^2)$ in the time-like region, it will be given
by (11.57) with $Q^2$ replaced by $q^2$, or $s$, for large values of $q^2$. Again,
measurements have verified the predicted variation of $\alpha(q^2)$ in the time-like region
(Miyabayashi et al. 1995, Ackerstaff et al. 1998, Abbiendi et al. 1999, 2000).

There is one more ‘elementary’ loop that we must analyse — the vertex 
correction shown in figure 11.8, which we now discuss. We will see how the 
important relation Z1 = Z2 emerges, and introduce some of the physics con- 
tained in the renormalized vertex. 


11.6 The $O(e^2)$ vertex correction, and $Z_1 = Z_2$


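The amplitude corresponding to figure 11.8 is

$$-ie\bar u(p')\Gamma^{[2]}_\mu(p,p')u(p) = \int \bar u(p')(-ie\gamma^\nu)\frac{i}{\slashed{p}' - \slashed{k} - m}(-ie\gamma_\mu)\frac{i}{\slashed{p} - \slashed{k} - m}(-ie\gamma_\nu)u(p)\,\frac{-i}{k^2}\,\frac{d^4k}{(2\pi)^4} \tag{11.66}$$

where $\gamma_\nu = g_{\nu\sigma}\gamma^\sigma$, and $\Gamma^{[2]}_\mu$ represents the correction to the standard vertex
and again $\xi = 1$. We find

$$\Gamma^{[2]}_\mu(p,p') = -ie^2\int\frac{1}{k^2}\,\gamma^\nu\frac{1}{\slashed{p}' - \slashed{k} - m}\gamma_\mu\frac{1}{\slashed{p} - \slashed{k} - m}\gamma_\nu\,\frac{d^4k}{(2\pi)^4}. \tag{11.67}$$

FIGURE 11.8
One-loop vertex correction.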


The integral is logarithmically divergent at large $k$, by power counting, and
the divergence will be cancelled by the $Z_1$ counter term of figure 11.1(c). It
turns out to be infrared divergent also, as was $d\Sigma^{[2]}/d\slashed{p}$. As in the latter
case, we leave the infrared problem aside, concentrating on the removal of
ultraviolet divergences.

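$Z_1$ is determined by the requirement that the total amplitude at $q =
p - p' = 0$, for on-shell fermions, is just $-ie\bar u(p)\gamma_\mu u(p)$, this being our definition
of ‘e’. Hence we have (at $O(e^2)$)

$$-ie\bar u(p)\Gamma^{[2]}_\mu(p,p)u(p) - ie\bar u(p)\gamma_\mu(Z_1^{[2]} - 1)u(p) = 0 \tag{11.68}$$

and so

$$\Gamma^{[2]}_\mu(p,p) + (Z_1^{[2]} - 1)\gamma_\mu = 0. \tag{11.69}$$

The renormalized vertex correction $\bar\Gamma^{[2]}_\mu$ may then be defined as

$$\bar\Gamma^{[2]}_\mu(p,p') = \Gamma^{[2]}_\mu(p,p') + (Z_1^{[2]} - 1)\gamma_\mu = \Gamma^{[2]}_\mu(p,p') - \Gamma^{[2]}_\mu(p,p) \tag{11.70}$$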


and in this ‘once-subtracted’ form it is finite, and equal to zero at q = 0. 

We shall consider some physical consequences of $\bar\Gamma^{[2]}_\mu$ in a moment, but
first we show that (at $O(e^2)$) $Z_1^{[2]} = Z_2^{[2]}$, and explain the significance of this
important relation. It is, after all, at first sight a rather surprising equality
between two apparently unrelated quantities, one associated with the fermion
self-energy, the other with the vertex part. From (11.9) we have, for the
fermion self-energy,

$$\Sigma^{[2]}(p) = -ie^2\int\frac{1}{k^2}\,\gamma^\nu\frac{1}{\slashed{p} - \slashed{k} - m}\gamma_\nu\,\frac{d^4k}{(2\pi)^4}. \tag{11.71}$$


One can discern some kind of similarity between (11.71) and (11.67), which 
can be elucidated with the help of a little algebra. 

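Consider differentiating the identity $(\slashed{p} - m)(\slashed{p} - m)^{-1} = 1$ with respect to
$p^\mu$:

$$0 = \frac{\partial}{\partial p^\mu}\left[(\slashed{p} - m)(\slashed{p} - m)^{-1}\right]
= \left[\frac{\partial}{\partial p^\mu}(\slashed{p} - m)\right](\slashed{p} - m)^{-1} + (\slashed{p} - m)\frac{\partial}{\partial p^\mu}(\slashed{p} - m)^{-1}
= \gamma_\mu(\slashed{p} - m)^{-1} + (\slashed{p} - m)\frac{\partial}{\partial p^\mu}(\slashed{p} - m)^{-1}. \tag{11.72}$$

It follows that

$$\frac{\partial}{\partial p^\mu}(\slashed{p} - m)^{-1} = -(\slashed{p} - m)^{-1}\gamma_\mu(\slashed{p} - m)^{-1} \tag{11.73}$$

from which the Ward identity (Ward 1950) follows immediately:

$$-\frac{\partial\Sigma^{[2]}}{\partial p^\mu} = \Gamma^{[2]}_\mu(p, p' = p). \tag{11.74}$$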
Derived here to one-loop order, the identity is, in fact, true to all orders,
provided that a gauge-invariant regularization is adopted. Note that the identity
deals with $\Gamma^{[2]}_\mu$ at zero momentum transfer ($q = p - p' = 0$), which is the
value at which $e$ is defined. Note also that consistently with (11.74), each of
$\partial\Sigma^{[2]}/\partial p^\mu$ and $\Gamma^{[2]}_\mu$ is both infrared and ultraviolet divergent, though we shall
only be concerned with the latter.

The quantities $\Sigma^{[2]}$ and $\Gamma^{[2]}_\mu$ are both $O(e^2)$, and contain ultraviolet
divergences which are cancelled by the $O(e^2)$ counter terms. From (11.11) and
(11.12) we have

$$\Sigma^{[2]} = \bar\Sigma^{[2]} - Z_2^{[2]}(m_0 - m) + (\slashed{p} - m)(Z_2^{[2]} - 1) \tag{11.75}$$

where $\bar\Sigma^{[2]}$ is finite, and from (11.70) we have

$$\Gamma^{[2]}_\mu(p,p') = \bar\Gamma^{[2]}_\mu(p,p') - (Z_1^{[2]} - 1)\gamma_\mu \tag{11.76}$$

where $\bar\Gamma^{[2]}_\mu$ is finite. Inserting (11.75) and (11.76) into (11.74) and equating
the infinite parts gives

$$Z_1^{[2]} = Z_2^{[2]}. \tag{11.77}$$

This relation is true to all orders ($Z_1 = Z_2$), provided a gauge-invariant
regularization is used. It is a very significant relation, as already indicated
after (11.8). It shows, first, that the gauge principle survives renormalization
provided the regularization is gauge invariant. More physically, it tells us that
the bare and renormalized charges are related simply by (cf (11.6))

$$e = e_0 Z_3^{1/2}. \tag{11.78}$$


In other words, the interaction-dependent rescaling of the bare charge is due 
solely to vacuum polarization effects in the photon propagator, which are 
the same for all charged particles interacting with the photon. By contrast, 
both $Z_1$ and $Z_2$ do depend on the specific type of the interacting charged
particle, since these quantities involve the particle masses. The ratio of bare 
to renormalized charge is independent of particle type. Hence if a set of bare 
charges are all equal (or ‘universal’), the renormalized ones will be too. But 
we saw in section 2.6 how just such a notion of universality was present in 
theories constructed according to the (electromagnetic) gauge principle. We 
now see how the universality survives renormalization. In volume 2 we shall 
find that a similar universality holds, empirically, in the case of the weak 
interaction, giving a strong indication that this force too should be described 
by a renormalizable gauge theory. 




11.7 The anomalous magnetic moment and tests of QED 


Returning now to $\bar\Gamma^{[2]}_\mu$, just as in section 11.5.2 we regarded the vacuum
polarization correction $1 + \bar\Pi^{[2]}_\gamma$ as a contribution to the fermion's charge form
factor $F_1(q^2)$, so we may expect that the vertex correction will also contribute
to the form factor. Indeed, let us recall the general form of the electromagnetic
vertex for a spin-$\frac{1}{2}$ particle (cf (8.208)):

$$-ie\bar u(p', s')\left[F_1(q^2)\gamma_\mu + \frac{i\kappa F_2(q^2)}{2m}\sigma_{\mu\nu}q^\nu\right]u(p, s) \tag{11.79}$$

where $\kappa$ is the ‘anomalous’ part of the magnetic moment, i.e. the magnetic
moment is $(e\hbar/2m)(1 + \kappa)$, the ‘1’ being the Dirac value calculated in
section 3.5. In (11.79), $F_1$ and $F_2$ are each normalized to 1 at $q^2 = 0$. Our
vertex $\Gamma^{[2]}_\mu$ contributes to both the charge and the magnetic moment form
factors; let us call the contributions $F_1^{[2]}$ and $\kappa F_2^{[2]}$. Now the $Z_1$ counter term
multiplies $\gamma_\mu$, and therefore clearly cancels a divergence in $F_1^{[2]}$. Is there also,
we may ask, a divergence in $\kappa F_2^{[2]}$?


FIGURE 11.9
Contribution (which is finite) to $\gamma\gamma \to \gamma\gamma$.

Actually, $\kappa F_2^{[2]}$ is convergent, and this is highly significant to the physics of
renormalization. Had it been divergent, we would either have had to abandon
the theory or introduce a new counter term to cancel the divergence. This
counter term would have the general form

$$\frac{K}{m}\,\bar\psi\sigma_{\mu\nu}\psi F^{\mu\nu}; \tag{11.80}$$

it is, indeed, an ‘anomalous magnetic moment’ interaction. But no such term 
exists in the original QED Lagrangian (11.1)! Its appearance does not seem 
to follow from the gauge principle argument, even though it is, in fact, gauge 
invariant. Part of the meaning of the renormalizability of QED (or any the- 
ory) is that all infinities can be cancelled by counter terms of the same form as 
the terms appearing in the original Lagrangian. This means, in other words, 
that all infinities can be cancelled by assuming an appropriate cut-off depen- 
dence for the fields and parameters in the bare Lagrangian. The interaction 
(11.80) is certainly gauge invariant — but it is non-renormalizable — as we 
shall discuss further later. The message is that, in a renormalizable theory, 
amplitudes which do not have counterparts in the interactions present in the 
bare Lagrangian must be finite. Figure 11.9 shows another example of an 
amplitude which turns out to be finite: there is no ‘$A^4$’ type of interaction in
QED (cf figure 10.13 (a) and the attendant comment in section 10.5). 

The calculation of the renormalized $F_1(q^2)$ and of $\kappa F_2(q^2)$ is quite
laborious, not least because three denominators are involved in the $\Gamma^{[2]}_\mu$ integral
(11.67). The dedicated reader can follow the story in section 6.3 of Peskin and
Schroeder (1995). The most important result is the value obtained for $\kappa$, the
QED-induced anomalous magnetic moment of the fermion, first calculated by
Schwinger (1948a). He obtained


$$\kappa = \frac{\alpha}{2\pi} \approx 0.001\,1614 \tag{11.81}$$

which means a g-factor corrected from the $g = 2$ Dirac value to

$$g = 2 + \frac{\alpha}{\pi} \tag{11.82}$$

or, equivalently,

$$[(g - 2)/2]_{\rm Schwinger} = \frac{\alpha}{2\pi} \approx 0.001\,1614. \tag{11.83}$$

Note that since $\kappa$ is a dimensionless quantity, it cannot depend on the mass $m$
of the internal fermion in (11.66). Contributions from two-loop (and higher) 
diagrams can involve different leptons in internal lines, and hence can depend 
on lepton mass ratios. 
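As a quick arithmetic check of Schwinger's one-loop value, the sketch below evaluates $\alpha/2\pi$ and the corrected g-factor, taking the central value of $\alpha^{-1}$ quoted in (11.86) as input. The variable names are illustrative only.

```python
import math

alpha_inverse = 137.035999037  # central value quoted in (11.86)
alpha = 1.0 / alpha_inverse

kappa = alpha / (2.0 * math.pi)   # one-loop anomalous moment, as in (11.81)
g_factor = 2.0 + alpha / math.pi  # corrected g-factor, as in (11.82)

print(f"kappa = alpha/(2 pi) = {kappa:.7f}")
print(f"g = 2 + alpha/pi     = {g_factor:.7f}")
```

The result depends only on $\alpha$, in line with the remark that the one-loop $\kappa$ cannot depend on the fermion mass.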

The prediction (11.83) may be compared with the experimental values
which are, for the electron (Hanneke et al. 2008)

$$a_{e,\rm expt} = [(g_e - 2)/2]_{\rm expt} = 1\,159\,652\,180.73\,(0.28) \times 10^{-12}\ \ [0.24\ \mathrm{ppb}] \tag{11.84}$$

and for the muon (Bennett et al. 2006)

$$a_{\mu,\rm expt} = [(g_\mu - 2)/2]_{\rm expt} = 116\,592\,080\,(63) \times 10^{-11}\ \ [0.54\ \mathrm{ppm}], \tag{11.85}$$

where the bracketed figures are the quoted uncertainties (statistical and sys- 
tematic combined in quadrature). Of course, in Schwinger’s day the exper- 
imental accuracy was far different, but there was a real discrepancy (Kusch 
and Foley 1947) with the Dirac value (a = 0). Schwinger’s one-loop calcu- 
lation provided a fundamental early confirmation of QED, and was the start 
of a long confrontation between theory and experiment which still continues. 
The interested reader is referred to the extensive review by Jegerlehner and 
Nyffeler (2009), upon which we shall draw in the following. 

The extraordinarily precise values in (11.84) and (11.85) represent the
result of ever more sophisticated and imaginative experimentation. The
measurement of $a_{e,\rm expt}$ is some 2250 times more accurate than that of $a_{\mu,\rm expt}$. Yet
the latter is capable of probing the Standard Model more deeply, for an
interesting reason. Consider expanding the vacuum polarization formula (11.18)
in powers of $m/\Lambda$, having done the momentum integrals as in (10.51) and
removed the $\ln\Lambda$ divergence by the subtraction (11.33). The resulting
expression will be finite as $\Lambda \to \infty$, but for finite $\Lambda$ it will contain $\Lambda$-dependent
terms, the first being of order $(m^2/\Lambda^2)$. This suggests that the contribution
of a ‘beyond QED physics’ scale to $a_{\mu,\rm theory}$ (modelled crudely by our cut-off)
would be enhanced by a factor $(m_\mu/m_e)^2 \approx 43000$ relative to its
contribution to $a_{e,\rm theory}$.$^1$ This outweighs by a factor of 19 the greater experimental
accuracy in $a_{e,\rm expt}$.
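The two numbers in this argument are easy to reproduce. The sketch below assumes approximate lepton masses (values not given in the text, supplied here for illustration) and uses the relative precisions quoted in (11.84) and (11.85).

```python
# Approximate lepton masses in MeV: assumed inputs, not taken from the text.
m_mu = 105.658
m_e = 0.510999

# Relative sensitivity of the muon to a heavy scale, (m_mu / m_e)^2.
enhancement = (m_mu / m_e) ** 2

# Ratio of experimental precisions: 0.54 ppm for the muon, 0.24 ppb for the electron.
accuracy_ratio = 0.54e-6 / 0.24e-9

# Net advantage of the muon measurement as a probe of heavy physics.
net_advantage = enhancement / accuracy_ratio

print(f"(m_mu/m_e)^2   = {enhancement:.0f}")
print(f"accuracy ratio = {accuracy_ratio:.0f}")
print(f"net advantage  = {net_advantage:.1f}")
```

With these inputs the enhancement comes out close to 43000 and the net factor close to 19, as stated above.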

This is both good news and bad news. We may distinguish three distinct
contributions to ‘beyond QED physics’ in $a_{e,\rm theory}$ and $a_{\mu,\rm theory}$: (i) SM weak
interactions; (ii) SM strong (or hadronic) interactions; (iii) beyond the SM
physics. Representative diagrams contributing to (i) and (ii) are shown in
figure 11.10 (a) and (b) respectively. Sensitivity of $a_{\mu,\rm theory}$ to effects under (i)
is welcome, since they are calculable, and in principle may provide precision
tests of the theory. Effects under (ii), however, are difficult to control, and
may limit the precision of the theoretical prediction, and hence the capacity
to discern the appearance of ‘beyond the SM physics’.

$^1$The sensitivity would be even greater for $a_\tau$ of course, but the very short lifetime of
the $\tau$ precludes an accurate measurement of its magnetic moment, at present.

FIGURE 11.10
‘Beyond QED’ contributions to $a_{\ell,\rm theory}$ ($\ell = e, \mu$) due to (a) weak and (b)
strong (hadronic) interaction corrections.

In the case of $a_{e,\rm theory}$, it turns out that the sensitivity to effects under (i)
and (ii) is very small. This allows for an essentially pure QED high precision
prediction of $a_e$. The accuracy of the experimental number requires calculation
of QED corrections up to 8th order, i.e. terms proportional to $(\alpha/\pi)^4$, which
contain 4 loops; there are 891 such diagrams. Their contribution has been
calculated by numerical methods by Kinoshita and collaborators (Aoyama et
al. 2007, 2008; Kinoshita and Nio 2006), who have also estimated the 10th
order (5-loop) contributions. To compare with experiment, a value of the fine
structure constant $\alpha$ is required. The most accurate value currently quoted is
(Bouchendira et al. 2011)


$$\alpha^{-1} = 137.035\,999\,037\,(91)\ \ [0.66\ \mathrm{ppb}]. \tag{11.86}$$

With this $\alpha$ the theoretical (QED) prediction of $a_e$ is

$$a^{\rm QED}_{e,\rm theory} = 1\,159\,652\,181.13\,(0.11)\,(0.37)\,(0.77) \times 10^{-12} \tag{11.87}$$

where the first, second, and third uncertainties come from the calculated 8th
order terms, the 10th order estimate, and the fine structure constant (11.86).
The theory is thus in good agreement with experiment, at an extraordinary
level of precision:

$$a_{e,\rm expt} - a_{e,\rm theory} = -0.40\,(0.88) \times 10^{-12}. \tag{11.88}$$


The QED part of the Standard Model is indeed the paradigm quantum field 
theory. Further progress will depend on the evaluation of the 10th order 
(5-loop) terms.

Turning now to $a_{\mu,\rm theory}$, the ‘pure QED’ part has been evaluated up to
4 loops and estimated at the 5-loop level, with the result (Jegerlehner and
Nyffeler 2009)

$$a^{\rm QED}_{\mu,\rm theory} = 116\,584\,718.1\,(0.2) \times 10^{-11} \tag{11.89}$$


where the error results from the uncertainties in the lepton mass ratios, the
numerical error in the $\alpha^4$ terms, the estimated uncertainty in the $\alpha^5$ terms,
and the uncertainty in the value of $\alpha$, which in (11.89) is determined from
$a_{e,\rm expt}$. There are also electroweak and hadronic contributions, $a^{\rm EW}_{\mu,\rm theory}$ and
$a^{\rm had}_{\mu,\rm theory}$. The first of these has been evaluated up to 2 loops, and the 3-loop
effects are negligible; the result is (Jegerlehner and Nyffeler 2009)

$$a^{\rm EW}_{\mu,\rm theory} = 153.2\,(1.8) \times 10^{-11}. \tag{11.90}$$

$a^{\rm had}_{\mu,\rm theory}$ is considerably larger, and has larger uncertainties. Its value is the
subject of intensive ongoing theoretical effort, and is likely to be regularly
updated. Here we give the value arrived at by Jegerlehner and Nyffeler (2009),
namely

$$a^{\rm had}_{\mu,\rm theory} = 6918.8\,(65) \times 10^{-11}. \tag{11.91}$$


Adding together (11.89), (11.90) and (11.91) gives the Standard Model
prediction

$$a^{\rm SM}_{\mu,\rm theory} = 116\,591\,790.1\,(65) \times 10^{-11}. \tag{11.92}$$


It is worth stressing that all of the Standard Model (electromagnetic, weak 
and strong theories) is needed for the result (11.92); it is also interesting that 
the theoretical error is essentially the same as the experimental one, at this 
stage. 

Comparison of (11.92) and (11.85) yields

$$a_{\mu,\rm expt} - a^{\rm SM}_{\mu,\rm theory} = 290\,(90) \times 10^{-11}. \tag{11.93}$$

Equation (11.93) represents a discrepancy of some 3 standard deviations. This
discrepancy between experiment and the SM prediction has persisted now for
a number of years, and is one of the very few significant (at this level) such
discrepancies. While it may be premature to conclude that $a_\mu$ can definitely
not be understood without some ‘beyond the SM’ physics, many such
possibilities are reviewed by Jegerlehner and Nyffeler (2009). No doubt this epic
confrontation between theory and experiment will continue to be pursued: it 
is a classic example of the way in which a very high-precision measurement 
in a thoroughly ‘low-energy’ area of physics (a magnetic moment) can have 
profound impact on the ‘high-energy’ frontier — a circumstance we may be 
increasingly dependent upon. 
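The bookkeeping behind (11.92) and (11.93) can be reproduced directly from the quoted numbers. The sketch below works in units of $10^{-11}$ and combines uncertainties in quadrature, which assumes that the individual errors are uncorrelated (an assumption of this sketch, not a statement from the text).

```python
import math

# (value, uncertainty) in units of 1e-11, as quoted in (11.85) and (11.89)-(11.91).
a_qed = (116584718.1, 0.2)
a_ew = (153.2, 1.8)
a_had = (6918.8, 65.0)
a_expt = (116592080.0, 63.0)

contributions = (a_qed, a_ew, a_had)

# Standard Model prediction: sum of the three contributions.
a_sm = sum(value for value, _ in contributions)
a_sm_err = math.sqrt(sum(err**2 for _, err in contributions))

# Experiment minus theory, and its significance.
difference = a_expt[0] - a_sm
difference_err = math.sqrt(a_expt[1]**2 + a_sm_err**2)
n_sigma = difference / difference_err

print(f"a_mu(SM)      = {a_sm:.1f} ({a_sm_err:.0f}) x 1e-11")
print(f"expt - theory = {difference:.0f} ({difference_err:.0f}) x 1e-11")
print(f"significance  = {n_sigma:.1f} sigma")
```

The hadronic uncertainty dominates the theoretical error, which is why the theoretical and experimental errors end up nearly equal.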

One conclusion we can certainly draw is that renormalizable quantum field 
theories are the most predictive theories we have. We end this volume with 
some general reflections on renormalizable, and non-renormalizable, theories.


11.8 Which theories are renormalizable — and does it 
matter? 


In the course of our travels thus far, we have met theories which exhibit 
three different types of ultraviolet behaviour. In the ABC theory at one-loop 
order, we found that both the field strength renormalizations and the vertex 
correction were finite; only the mass shifts diverged as $\Lambda \to \infty$. The theory was
called ‘super-renormalizable’. In QED, we needed divergent renormalization
constants $Z_i$ as well as an infinite mass shift — but (although we did not
attempt to explain why) these counter terms were enough to cure divergences 
systematically to all orders and the theory was renormalizable. Finally, we 
asserted that the anomalous coupling (11.80) was non-renormalizable. In the 
final section of this volume we shall try to shed more light on these distinctions 
and their significance. 

Is there some way of telling which of these ultraviolet behaviours a given 
Lagrangian is going to exhibit, without going through the calculations? The 
answer is yes (nearly), and the test is surprisingly simple. It has to do with the 
dimensionality of a theory’s coupling constant. We have seen (section 6.3.1)
that the dimensionality of ‘g’ in the ABC theory is $M^1$ (using mass as the
remaining dimension when $\hbar = c = 1$), that of $e$ in QED is $M^0$ (section 7.4)
and that of the coefficient of the anomalous coupling $\bar\psi\sigma_{\mu\nu}\psi F^{\mu\nu}$ in (11.80)
is $M^{-1}$. These couplings have positive, zero and negative mass dimension,
respectively. It is no accident that the three theories, with different dimensions 
for their couplings, have different ultraviolet behaviour and hence different 
renormalizability. 

That coupling constant dimensionality and ultraviolet behaviour are re- 
lated can be understood by simple dimensional considerations. Compare, for 
example, the vertex corrections in the ABC theory (figure 10.6) and in QED 
(figure 11.8). These amplitudes behave essentially as 


$$G^{[2]} \sim g^2_{\rm ph}\int\frac{d^4k}{k^2\,k^2\,k^2} \tag{11.94}$$

and

$$\Gamma^{[2]} \sim e^2\int\frac{d^4k}{\slashed{k}\,\slashed{k}\,k^2} \tag{11.95}$$

respectively, for large $k$. Both are dimensionless: but in (11.94) the positive
(mass)$^2$ dimension of $g^2_{\rm ph}$ is compensated by two additional factors of $k$ in
the denominator of the integral, as compared with (11.95), with the result
that (11.94) is ultraviolet convergent but (11.95) is not. The analysis can be
extended to higher-order diagrams: for the ABC theory, the more powers of
$g_{\rm ph}$ which are involved, the more denominator factors are necessary, and hence
the better the convergence is. Indeed, in this kind of ‘super-renormalizable’
theory, only a finite number of diagrams are ultraviolet divergent, to all orders
in perturbation theory.
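The power counting can be made concrete with the radial part of a Euclidean-style integral $\int d^4k/k^n$. The sketch below is a rough illustration only: it drops the angular factor, and it introduces a lower limit `mu` and an upper cut-off `cutoff` that are assumptions of the sketch rather than quantities from the text.

```python
import math


def radial_integral(n, mu, cutoff):
    """Integral of k^3 dk / k^n from mu to cutoff (radial part of d^4k / k^n)."""
    power = 4 - n  # exponent obtained after integrating k^(3 - n)
    if power == 0:
        return math.log(cutoff / mu)
    return (cutoff**power - mu**power) / power


mu = 1.0  # lower limit, in arbitrary units
for cutoff in (1e1, 1e3, 1e6):
    abc_like = radial_integral(6, mu, cutoff)  # behaves like (11.94)
    qed_like = radial_integral(4, mu, cutoff)  # behaves like (11.95)
    print(f"cutoff = {cutoff:.0e}: n=6 gives {abc_like:.6f}, n=4 gives {qed_like:.3f}")
```

The n = 6 case settles to a fixed value as the cut-off grows, while the n = 4 case keeps growing like the logarithm of the cut-off.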

It is clear that some kind of opposite situation must obtain when the 
coupling constant dimensionality is negative; for then, as the order of the per- 
turbation theory increases, the negative powers of M in the coupling constant 
factors must be compensated by positive powers of k in the numerators of 
loop integrals. Hence the divergence will tend to get worse at each successive 
order. A famous example of such a theory is Fermi’s original theory of $\beta$-decay
(Fermi 1934a, b), referred to in section 1.3.5, in which the interaction density
has the ‘four-fermion’ form

$$G_F\,\bar\psi_p(x)\psi_n(x)\bar\psi_e(x)\psi_{\nu_e}(x) \tag{11.96}$$

where $G_F$ is the ‘Fermi constant’. To find the dimensionality of $G_F$, we first
establish that of the fermion field by considering a mass term $m\bar\psi\psi$, for
example. The integral of this over $d^3x$ gives one term in the Hamiltonian, which has
dimension $M$. We deduce that $[\psi] = \frac{3}{2}$, since $[d^3x] = -3$. Hence $[\bar\psi\psi\bar\psi\psi] = 6$,
and so $[G_F] = -2$. The coupling constant $G_F$ in (11.96) therefore has a
negative mass dimension, just like the coefficient $K/m$ in (11.80). Indeed, the
four-fermion theory is also non-renormalizable.
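The dimension counting used for the couplings met so far can be collected in a few lines. The sketch below assumes the standard four-dimensional field dimensions (scalar and photon fields 1, fermion fields 3/2, field strength 2) and a Lagrangian density of mass dimension 4; the dictionary keys and function name are illustrative.

```python
from fractions import Fraction

# Mass dimensions of fields in four space-time dimensions (hbar = c = 1).
FIELD_DIMENSION = {
    "scalar": Fraction(1),
    "fermion": Fraction(3, 2),
    "photon": Fraction(1),          # the potential A_mu
    "field_strength": Fraction(2),  # F_{mu nu}, one derivative of A_mu
}


def coupling_dimension(fields):
    """Mass dimension of the coupling multiplying a product of the given fields."""
    return 4 - sum(FIELD_DIMENSION[name] for name in fields)


interactions = {
    "ABC theory (three scalars)": ["scalar", "scalar", "scalar"],
    "QED vertex": ["fermion", "fermion", "photon"],
    "anomalous moment term (11.80)": ["fermion", "fermion", "field_strength"],
    "four-fermion term (11.96)": ["fermion"] * 4,
}

for name, fields in interactions.items():
    print(f"{name}: coupling has mass dimension {coupling_dimension(fields)}")
```

The four results reproduce the dimensions quoted in the text: positive for the ABC theory, zero for QED, and negative for the two non-renormalizable couplings.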

Must such a theory be rejected? Let us briefly sketch the consequences of 
an interaction of the form (11.96), but slightly simpler, namely 


$$G_F\,\bar\psi_n(x)\psi_n(x)\bar\psi_{\nu_e}(x)\psi_{\nu_e}(x) \tag{11.97}$$

where, for the present purposes, the neutron is regarded as point-like.
Consider, for example, the scattering process $\nu_e + n \to \nu_e + n$. To lowest order
in $G_F$, this is given by the tree diagram (or ‘contact term’) of figure 11.11,
which contributes a constant $-iG_F$ to the invariant amplitude for the process,
disregarding the spinor factors for the moment. A one-loop $O(G_F^2)$ correction
is shown in figure 11.12. Inspection of figure 11.12 shows that this is an
s-channel process (recall section 6.3.3): let us call the amplitude $-iG_FG^{[2]}(s)$,
where one $G_F$ factor has been extracted, so that the correction can be
compared with the tree amplitude and $G^{[2]}(s)$ is dimensionless. Then $G^{[2]}(s)$ is
given by

$$G^{[2]}(s) = iG_F\int\frac{d^4k}{(2\pi)^4}\,\frac{1}{\slashed{k} - m_{\nu_e}}\,\frac{1}{(\slashed{p}_{\nu_e} + \slashed{p}_n - \slashed{k}) - m_n}. \tag{11.98}$$

As expected, the negative mass dimension of $G_F$ leaves fewer $k$-factors in the
denominator of the loop integral. Indeed, manipulations exactly like those
we used in the case of $\Sigma^{[2]}$ show that $G^{[2]}(s)$ has a quadratic divergence,
and that $dG^{[2]}/ds$ has a logarithmic divergence. The extra denominators
associated with second and higher derivatives of $G^{[2]}(s)$ are sufficient to make
these integrals finite.

FIGURE 11.11
Lowest order contribution to $\nu_e + n \to \nu_e + n$ in the model defined by the
interaction (11.97).

FIGURE 11.12
Second-order (one-loop) contribution to $\nu_e + n \to \nu_e + n$.


The standard procedure would now be to cancel these divergences with 
counter terms. There will certainly be one counter term arising naturally 
from writing the bare version of (11.97) as (cf (11.5)): 


$$G_{0F}\,\bar\psi_{0n}\psi_{0n}\bar\psi_{0\nu_e}\psi_{0\nu_e} = G_F\,\bar\psi_n\psi_n\bar\psi_{\nu_e}\psi_{\nu_e} + (Z_4 - 1)G_F\,\bar\psi_n\psi_n\bar\psi_{\nu_e}\psi_{\nu_e} \tag{11.99}$$

where $Z_4G_F = G_{0F}Z_{2,n}Z_{2,\nu_e}$, and the $Z_2$’s are the field strength
renormalization constants for the $n$ and $\nu_e$ fields. Including the tree graph of figure 11.11,
the amplitude of figure 11.12, and the counter term, the total amplitude to
$O(G_F^2)$ is given by

$$i\mathcal{M} = -iG_F - iG_FG^{[2]}(s) - iG_F(Z_4 - 1). \tag{11.100}$$

As in our earlier examples, $Z_4$ will be determined from a renormalization
condition. In this case, we might demand, for example, that the amplitude
$\mathcal{M}$ reduces to $-G_F$ at the threshold value $s = s_0$, where $s_0 = (m_n + m_{\nu_e})^2$.
Then to $O(G_F^2)$ we find

$$Z_4^{[2]} = 1 - G^{[2]}(s_0) \tag{11.101}$$

and our amplitude (11.100) is, in fact,

$$-iG_F - iG_F[G^{[2]}(s) - G^{[2]}(s_0)]. \tag{11.102}$$

In (11.102), we see the familiar outcome of such renormalization: the
appearance of subtractions of the divergent amplitude (cf (10.74), (11.11),
(11.33) and (11.70)). In fact, because $dG^{[2]}/ds$ is also divergent, we need a
second subtraction and, correspondingly, a new counter term, not present in
the original Lagrangian, of the form

$$G_d\,\bar\psi_n\slashed{\partial}\psi_n\,\bar\psi_{\nu_e}\slashed{\partial}\psi_{\nu_e},$$


for example; there will also be others, but we are concerned only with the
general idea. The occurrence of such a new counter term is characteristic of a
non-renormalizable theory, but at this stage of the proceedings the only penalty
we pay is the need to import another constant from experiment, namely the
value $D$ of $dG^{[2]}/ds$ at some fixed $s$, say $s = s_0$; $D$ will be related to the
renormalized value of $G_d$. We will then write our renormalized amplitude, up
to $O(G_F^2)$, as

$$-iG_F[1 + D(s - s_0) + \bar G^{[2]}(s)] \tag{11.103}$$

where $\bar G^{[2]}(s)$ is finite, and vanishes along with its first derivative at $s = s_0$;
that is, $\bar G^{[2]}(s)$ contributes calculable terms of order $(s - s_0)^2$ if expanded
about $s = s_0$.
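The effect of the two subtractions can be illustrated with a toy function. The form used below (a quadratic and a logarithmic cut-off dependence plus a finite $s^2$ term, in units where all masses are 1) is an assumption chosen purely for illustration; it is not the result of evaluating (11.98).

```python
import math


def toy_loop(s, cutoff):
    """Toy loop function with quadratic and logarithmic cut-off dependence.

    Hypothetical form for illustration only, not derived from (11.98).
    """
    return cutoff**2 + s * math.log(cutoff**2) + s**2


def toy_loop_slope(s, cutoff):
    """Derivative of toy_loop with respect to s."""
    return math.log(cutoff**2) + 2.0 * s


def twice_subtracted(s, s0, cutoff):
    """Subtract the value and the first derivative of toy_loop at s = s0."""
    return (toy_loop(s, cutoff) - toy_loop(s0, cutoff)
            - (s - s0) * toy_loop_slope(s0, cutoff))


s, s0 = 3.0, 1.0
for cutoff in (1e2, 1e3, 1e4):
    print(f"cutoff = {cutoff:.0e}: unsubtracted = {toy_loop(s, cutoff):.3e}, "
          f"twice subtracted = {twice_subtracted(s, s0, cutoff):.6f}")
```

In this toy case the twice-subtracted function is just $(s - s_0)^2$, independent of the cut-off, which mirrors the statement that the finite remainder starts at order $(s - s_0)^2$.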

The moral of the story so far, then, is that we can perform a one-loop 
renormalization of this theory, at the cost of taking additional parameters 
from experiments and introducing new terms in the Lagrangian. What about 
the next order? Figure 11.13 shows a two-loop diagram in our theory, which is
of order $G_F^3$. Writing the amplitude as $-iG_FG^{[3]}(s)$, the ultraviolet behaviour
of $G^{[3]}(s)$ is given by

$$(G_F)^2\int\frac{d^4k_1\,d^4k_2}{\slashed{k}^4} \tag{11.104}$$

where $k$ is a linear function of $k_1$ and $k_2$. This has a leading ultraviolet
divergence $\sim \Lambda^4$, even worse than that of $G^{[2]}$. As suggested earlier, it is
indeed the case that, the higher we go in perturbation theory in this model,
the worse the divergences become. We can, of course, eliminate this divergence
in $G^{[3]}$ by performing a further subtraction, requiring the provision of more
parameters from experiment. By now the pattern should be becoming clear:
new counter terms will have to be introduced at each order of perturbation 
theory, and ultimately we shall need an infinite number of them, and hence 
an infinite number of parameters determined from experiment — and we shall 
have zero predictive capacity. 

FIGURE 11.13
A two-loop contribution to $\nu_e + n \to \nu_e + n$ in the model defined by (11.97).

Does this imply that the theory is useless? We have learned that $\bar G^{[2]}(s)$
produces a calculable term of order $G_F^2(s - s_0)^2$ when expanded about $s = s_0$;
and that $\bar G^{[3]}$ will produce a calculable term of order $G_F^3(s - s_0)^3$, and so on.
Now, from the discussion after (11.96), $G_F$ itself is a dimensionless number
divided by the square of some mass. As we saw in section 1.3.5 (and will return
to in more detail in volume 2), in the case of the physical weak interaction
this mass in $G_F$ is the W-mass, and $G_F \sim \alpha/M_W^2$. Hence our loop corrections
have the form $\alpha^2(s - s_0)^2/M_W^4$, $\alpha^3(s - s_0)^3/M_W^6$, .... We now see that for low
enough energy close to threshold, where $(s - s_0) \ll M_W^2$, it will be a good
approximation to stop at the one-loop level. As we go up in energy, we will
need to include higher-order loops, and correspondingly more parameters will
have to be drawn from experiment. But only when we begin to approach an
energy $\sqrt{s} \sim M_W/\sqrt{\alpha} \sim G_F^{-1/2} \sim 300$ GeV will this theory be terminally sick.
This was pointed out by Heisenberg (1939). For this argument to work, it is
important that the ultraviolet divergences at a given order in perturbation
theory (i.e. a given number of loops) should have been removed by
renormalization, otherwise factors of $\Lambda^2$ will enter, in place of the $(s - s_0)$ factors, for
example.
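The scale at which the four-fermion theory breaks down can be estimated numerically. The sketch below assumes a numerical value for the Fermi constant (not quoted in this passage) and, for simplicity, uses $G_F s$ as the dimensionless expansion parameter, neglecting $s_0$.

```python
# Fermi constant in GeV^-2: an assumed input, not a value given in this section.
G_F = 1.1664e-5

# Natural energy scale of the theory, G_F^(-1/2), in GeV.
natural_scale = G_F ** -0.5
print(f"G_F^(-1/2) = {natural_scale:.0f} GeV")

# Size of the dimensionless combination G_F * s at several CM energies.
for sqrt_s in (1.0, 10.0, 100.0, 300.0):
    expansion_parameter = G_F * sqrt_s**2
    print(f"sqrt(s) = {sqrt_s:5.0f} GeV: G_F * s = {expansion_parameter:.2e}")
```

The combination $G_F s$ is tiny at energies of a few GeV and only becomes of order one near 300 GeV, consistent with the estimate in the text.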

We have seen that a non-renormalizable theory can be useful at energies 
well below the ‘natural’ scale specified by its coupling constant. Let us look at 
this in a slightly different way, by considering the two four-fermion interaction 
terms introduced at one loop, 


$$G_F\,\bar\psi_n\psi_n\bar\psi_{\nu_e}\psi_{\nu_e} \quad\text{and}\quad G_d\,\bar\psi_n\slashed{\partial}\psi_n\,\bar\psi_{\nu_e}\slashed{\partial}\psi_{\nu_e}. \tag{11.105}$$

We know that $G_F \sim M_W^{-2}$, and similarly $G_d \sim M_W^{-4}$ (from dimensional
counting, or from the association of the $G_d$ term with the $O(G_F^2)$ counter term).
From dimensional analysis, or by referring to (11.103) and remembering that
$D$ is of order $G_F$ for consistency, we see that the second term in (11.105), when
evaluated at tree level, is of order $(s - s_0)/M_W^2$ times the first. It follows that
higher derivative interactions, and in general terms with successively larger 
negative mass dimension, are increasingly suppressed at low energies. 

Where, then, do renormalizable theories fit into this? Those with
couplings having positive mass dimension (‘super-renormalizable’) have, as we
have seen, a limited number of infinities and can be quickly renormalized.
The ‘merely renormalizable’ theories have dimensionless coupling constants,
such as $e$ (or $\alpha$). In this case, since there are no mass factors (for good or ill)
to be associated with powers of $\alpha$, as we go up in order of perturbation theory
it would seem plausible that the divergences get essentially no worse, and can
be cured by the counter terms which compensated those simplest divergences
which we examined in earlier sections, though for QED the proof is difficult,
and took many years to perfect.

Given any renormalizable theory, such as QED, it is always possible to 
suppose that the ‘true’ theory contains additional non-renormalizable terms, 
provided their mass scale is very much larger than the energy scale at which 
the theory has been tested. For example, a term of the form (11.80) with 
‘K/m’ replaced by some very large inverse mass $M^{-1}$ would be possible, and
would contribute an amount of order 4e/M to a lepton magnetic moment. 
The present level of agreement between theory and experiment in the case of 
the electron’s moment implies that M > 4 x 10? GeV. 

From this perspective, then, it may be less of a mystery why renormal- 
izable theories are generally the relevant ones at presently posed energies. 
Returning to the line of thought introduced in section 10.1.1, we may imag- 
ine that a ‘true’ theory exists at some enormously high energy A (the Planck 
scale?) which, though not itself a local quantum field theory, can be written 
in terms of all possible fields and their couplings, as allowed by certain sym- 
metry principles. Our particular renormalizable subset of these theories then 
emerges as a low-energy effective theory, due to the strong suppression of the 
non-renormalizable terms. Of course, for this point of view to hold, we must 
assume that the latter interactions do not have ‘unnaturally large’ couplings, 
when expressed in terms of A. 

This interpretation, if correct, deals rather neatly with what was, for many 
physicists, an awkward aspect of renormalizable theories. On the one hand, 
it was certainly an achievement to have rendered all perturbative calculations 
finite as the cut-off went to infinity; but on the other, it was surely unreason- 
able to expect any such theory, established by confrontation with experiments 
in currently accessible energy regimes, really to describe physics at arbitrarily 
high energies. On the ‘low-energy effective field theory’ interpretation, we can 
enjoy the calculational advantages of renormalizable field theories, while ac- 
knowledging — with no contradiction — the likelihood that at some scale ‘new 
physics’ will enter. 

Having thus argued that renormalizable theories emerge ‘naturally’ as low- 
energy theories, we now seem to be faced with another puzzle: why were weak 
interactions successfully describable, for many years, in terms of the non- 
renormalizable four-fermion theory? The answer is that non-renormalizable 
theories may be physically detectable at low energies if they contribute to 
processes that would otherwise be forbidden. For example, the fact that (as 
far as we know) neutrinos have neither electromagnetic nor strong interactions, 
but only weak interactions, allowed the four-fermion theory to be detected — 
but amplitudes were suppressed by powers of $s/M_W^2$ (relative to comparable
electromagnetic ones) and this was, indeed, why it was called ‘weak’!

FIGURE 11.14
One-Z (Yukawa-type) exchange process in $\nu_e + n \to \nu_e + n$.


In the case of the weak interaction, the reader may perhaps wonder why — if 
it was understood that the four-fermion theory could after all be handled up to 
energies of order 10 GeV — so much effort went in to creating a renormalizable 
theory of weak interactions, as it undoubtedly did. Part of the answer is that 
the utility of non-renormalizable interactions was a rather late realization (see, 
for example, Weinberg 1979). But surely the prospect of having a theory with 
the predictive power of QED was a determining factor. At all events, the 
preceding argument for the ‘naturalness’ of renormalizable theories as low- 
energy effective theories provides strong expectation that such a description 
of weak interactions should exist. 

We shall discuss the construction of the currently accepted renormalizable 
theory of electroweak interactions in volume 2. We can already anticipate 
that the first step will be to replace the ‘negative-mass-dimensioned’ constant 
$G_F$ by a dimensionless one. The most obvious way to do this is to envisage 
a Yukawa-type theory of weak interactions mediated by a massive quantum 
(as, of course, Yukawa himself did; see section 1.3.5). The four-fermion 
process of figure 11.11 would then be replaced by that of figure 11.14, with 
amplitude (omitting spinors) $\sim g_Z^2/(q^2 - m_Z^2)$ where $g_Z$ is dimensionless. For 
small $q^2 \ll m_Z^2$, this reduces to the contact four-fermion form of figure 11.11, 
with an effective $G_F \sim g_Z^2/m_Z^2$, showing the origin of the negative mass 
dimensions of $G_F$. It is clear that even if the new theory were to be 
renormalizable, many low-energy processes would be well described by an effective 
non-renormalizable four-fermion theory, as was indeed the case historically. 
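
The reduction of the exchange amplitude to a contact form at small $q^2$ can be checked numerically. The following is a minimal sketch, not part of the original text: the mass `M_Z` and coupling `G_Z` are illustrative assumptions, and the function names are hypothetical.

```python
# Illustrative (assumed) inputs; neither value is quoted in the text above.
M_Z = 91.19   # GeV, assumed mass of the exchanged quantum
G_Z = 0.7     # assumed dimensionless coupling


def yukawa_amplitude(q2, g=G_Z, m=M_Z):
    """Exchange amplitude with spinors omitted: g^2 / (q^2 - m^2); q2 in GeV^2."""
    return g ** 2 / (q2 - m ** 2)


def contact_amplitude(g=G_Z, m=M_Z):
    """Low-energy contact limit -g^2 / m^2, of magnitude G_F ~ g^2 / m^2."""
    return -g ** 2 / m ** 2


def relative_deviation(q2):
    """Fractional difference between the exchange and contact amplitudes."""
    return abs(yukawa_amplitude(q2) / contact_amplitude() - 1.0)


if __name__ == "__main__":
    # Space-like momentum transfers (q^2 < 0), in GeV^2.
    for q2 in (-0.01, -1.0, -100.0, -1000.0):
        print(q2, relative_deviation(q2))
```

For $|q^2|$ of a few GeV$^2$ the two amplitudes agree to better than one part in a thousand, while at $|q^2| \sim 10^3$ GeV$^2$ the contact form is already off by about ten per cent under these assumed inputs.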

Unfortunately, we shall see in volume 2 that the application of this simple 
idea to the charge-changing weak interactions does not, after all, lead to a 
renormalizable theory. This teaches us an important lesson: a dimensionless 
coupling does not necessarily guarantee renormalizability. 

To arrive at a renormalizable theory of the weak interactions it seems to be 
necessary to describe them in terms of a gauge theory (recall the ‘universality’ 
hints mentioned in section 11.6). Yet the mediating gauge field quanta have 
mass, which appears to contradict gauge invariance. The remarkable story of 
how gauge field quanta can acquire mass while preserving gauge invariance is 
reserved for volume 2. 

A number of other non-renormalizable interactions are worth mentioning. 
Perhaps the most famous of all is gravity, characterized by Newton’s constant 
$G_N$, which has the value $(1.2 \times 10^{19}\ \mathrm{GeV})^{-2}$. The detection of gravity at 
energies so far below $10^{19}$ GeV is due, of course, to the fact that the gravitational 
fields of all the particles in a macroscopic piece of matter add up coherently. 
At the level of the individual particles, its effect is still entirely negligible. 
Another example may be provided by baryon and/or lepton violating 
interactions, mediated by highly suppressed non-renormalizable terms.² Such things 
are frequently found when the low-energy limit is taken of theories defined 
(for example) at energies of order $10^{16}$ GeV or higher. 
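
To put a number on ‘entirely negligible’ for gravity at the level of individual particles, one can compare the Newtonian and Coulomb forces between two protons; both fall off as $1/r^2$, so their ratio is a pure number. This is a rough sketch rather than anything from the original text: the proton mass and the value of $\alpha$ are assumed inputs, and the function name is hypothetical.

```python
M_PLANCK = 1.2e19     # GeV; G_N = 1 / M_PLANCK**2, the value quoted in the text
M_PROTON = 0.938      # GeV, assumed proton mass
ALPHA = 1.0 / 137.0   # assumed fine structure constant


def gravity_to_coulomb_ratio(mass=M_PROTON):
    """Ratio (G_N m^2 / r^2) / (alpha / r^2) for two unit charges of equal mass."""
    g_newton = 1.0 / M_PLANCK ** 2   # GeV^-2 in natural units
    return g_newton * mass ** 2 / ALPHA


if __name__ == "__main__":
    print(gravity_to_coulomb_ratio())
```

With these inputs the ratio comes out below $10^{-36}$, which is the sense in which gravity is invisible in single-particle processes.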

The stage is now set for the discussion, in volume 2, of the renormalizable 
non-Abelian gauge field theories which describe the weak and strong sectors 
of the Standard Model. 


Problems 
11.1 Establish the values of the counter terms given in (11.12). 


11.2 Convince yourself of the rule ‘each closed fermion loop carries an addi- 
tional factor —1’. 


11.3 Explain why the trace is taken in (11.14). 
11.4 Verify (11.15). 


11.5 Verify the quoted relation $P^\rho_{\ \tau}P^\tau_{\ \nu} = P^\rho_{\ \nu}$ where $P^\rho_{\ \nu} = g^\rho_{\ \nu} - q^\rho q_\nu/q^2$ (cf 
(11.26)). 


11.6 Verify (11.39) for $q^2 \ll m^2$. 
11.7 Verify (11.55) for $-q^2 \gg m^2$. 
11.8 Check the estimate (11.60). 


11.9 Find the dimensionality of ‘E’ in an interaction of the form $E(F_{\mu\nu}F^{\mu\nu})^2$. 
Express this interaction in terms of the E and B fields. Is such a term finite 
or infinite in QED? How might it be measured? 


² The most general renormalizable Lagrangian with the field content, and the gauge 
symmetries, of the Standard Model automatically conserves baryon and lepton number 
(Weinberg 1996, pp 316-7). 

A 


Non-relativistic Quantum Mechanics 


This appendix is intended as a very terse ‘revision’ summary of those aspects 
of non-relativistic quantum mechanics that are particularly relevant for this 
book. A fuller account may be found in Mandl (1992), for example. 

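Natural units $\hbar = c = 1$ (see appendix B). 

Fundamental postulate of quantum mechanics: 

$$[\hat{p}_i, \hat{x}_j] = -i\delta_{ij}. \tag{A.1}$$

Coordinate representation: 

$$\hat{\boldsymbol{p}} = -i\boldsymbol{\nabla} \tag{A.2}$$
$$\hat{H}\psi(\boldsymbol{x},t) = i\frac{\partial\psi(\boldsymbol{x},t)}{\partial t}. \tag{A.3}$$

Schrödinger equation for a spinless particle: 

$$\hat{H} = \frac{\hat{\boldsymbol{p}}^2}{2m} + \hat{V} \tag{A.4}$$
and so 
$$\left(-\frac{1}{2m}\boldsymbol{\nabla}^2 + \hat{V}\right)\psi(\boldsymbol{x},t) = i\frac{\partial\psi(\boldsymbol{x},t)}{\partial t}. \tag{A.5}$$

Probability density and current (see problem 3.1 (a)): 

$$\rho = \psi^*\psi = |\psi|^2 \geq 0 \tag{A.6}$$
$$\boldsymbol{j} = \frac{1}{2mi}\left[\psi^*(\boldsymbol{\nabla}\psi) - (\boldsymbol{\nabla}\psi^*)\psi\right] \tag{A.7}$$
with 
$$\frac{\partial\rho}{\partial t} + \boldsymbol{\nabla}\cdot\boldsymbol{j} = 0. \tag{A.8}$$

Free-particle solutions: 

$$\phi(\boldsymbol{x},t) = u(\boldsymbol{x})e^{-iEt} \tag{A.9}$$
$$\hat{H}_0u = Eu \tag{A.10}$$
where 
$$\hat{H}_0 = \hat{H}(\hat{V} = 0). \tag{A.11}$$

Box normalization: 

$$\int_V u^*(\boldsymbol{x})u(\boldsymbol{x})\,\mathrm{d}^3\boldsymbol{x} = 1. \tag{A.12}$$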


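Angular momentum: Three Hermitian operators $(\hat{J}_x, \hat{J}_y, \hat{J}_z)$ satisfying 
$$[\hat{J}_x, \hat{J}_y] = i\hbar\hat{J}_z$$
and corresponding relations obtained by rotating the $x$-$y$-$z$ subscripts. The 
result $[\hat{\boldsymbol{J}}^2, \hat{J}_z] = 0$ implies complete sets of states exist with definite values of 
$\hat{\boldsymbol{J}}^2$ and $\hat{J}_z$. Eigenvalues of $\hat{\boldsymbol{J}}^2$ are (with $\hbar = 1$) $j(j+1)$ where $j = 0, \frac{1}{2}, 1, \ldots$; 
eigenvalues of $\hat{J}_z$ are $m$ where $-j \leq m \leq j$, for given $j$. For orbital angular 
momentum, $\hat{\boldsymbol{J}} \to \hat{\boldsymbol{L}} = \boldsymbol{r}\times\hat{\boldsymbol{p}}$ and eigenfunctions are spherical harmonics 
$Y_{lm}(\theta, \phi)$, for which eigenvalues of $\hat{\boldsymbol{L}}^2$ and $\hat{L}_z$ are $l(l+1)$ and $m$ where 
$-l \leq m \leq l$. For spin-$\frac{1}{2}$ angular momentum, $\hat{\boldsymbol{J}} \to \frac{1}{2}\boldsymbol{\sigma}$ where the Pauli matrices 
$\boldsymbol{\sigma} = (\sigma_x, \sigma_y, \sigma_z)$ are 
$$\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{A.13}$$
Eigenvectors of $s_z$ are $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$ (eigenvalue $+\frac{1}{2}$), and $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$ (eigenvalue $-\frac{1}{2}$). 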
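
The spin-$\frac{1}{2}$ statements above can be verified by direct matrix multiplication. The sketch below is an illustration added for that purpose, with $\hbar = 1$; the helper names (`matmul`, `combine`, `commutator`, `close`, `j_squared`) are hypothetical and only the Python standard library is used.

```python
def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def combine(a, b, ca=1, cb=1):
    """Entrywise linear combination ca*a + cb*b."""
    return [[ca * a[i][j] + cb * b[i][j] for j in range(2)] for i in range(2)]


def commutator(a, b):
    """[a, b] = ab - ba."""
    return combine(matmul(a, b), matmul(b, a), 1, -1)


def close(a, b, tol=1e-12):
    """True if two 2x2 matrices agree entry by entry within tol."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))


ZERO = [[0, 0], [0, 0]]
SIGMA = {
    "x": [[0, 1], [1, 0]],
    "y": [[0, -1j], [1j, 0]],
    "z": [[1, 0], [0, -1]],
}
# Spin-1/2 operators J = sigma / 2 (hbar = 1).
J = {axis: combine(s, ZERO, 0.5, 0) for axis, s in SIGMA.items()}


def j_squared():
    """J_x^2 + J_y^2 + J_z^2, expected to equal j(j+1) = 3/4 times the unit matrix."""
    total = ZERO
    for op in J.values():
        total = combine(total, matmul(op, op))
    return total


if __name__ == "__main__":
    # [J_x, J_y] should equal i J_z.
    print(close(commutator(J["x"], J["y"]), combine(J["z"], ZERO, 1j, 0)))
    print(j_squared())
```

The same check with the indices rotated cyclically confirms the ‘corresponding relations’ mentioned in the text.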

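Interaction with electromagnetic field: Particle of charge $q$ in electromagnetic 
vector potential $\boldsymbol{A}$ 
$$\hat{\boldsymbol{p}} \to \hat{\boldsymbol{p}} - q\boldsymbol{A}. \tag{A.14}$$
Thus 
$$\frac{1}{2m}(\hat{\boldsymbol{p}} - q\boldsymbol{A})^2\psi = i\frac{\partial\psi}{\partial t} \tag{A.15}$$
and so 
$$-\frac{1}{2m}\boldsymbol{\nabla}^2\psi + i\frac{q}{m}\boldsymbol{A}\cdot\boldsymbol{\nabla}\psi + \frac{q^2}{2m}\boldsymbol{A}^2\psi = i\frac{\partial\psi}{\partial t}. \tag{A.16}$$

Note: (i) chosen gauge $\boldsymbol{\nabla}\cdot\boldsymbol{A} = 0$; (ii) $q^2$ term is usually neglected. 

Example: Magnetic field along $z$-axis, possible $\boldsymbol{A}$ consistent with $\boldsymbol{\nabla}\cdot\boldsymbol{A} = 0$ 
is $\boldsymbol{A} = \frac{1}{2}B(-y, x, 0)$ such that $\boldsymbol{\nabla}\times\boldsymbol{A} = (0, 0, B)$. Inserting this into the second 
term on left-hand side of (A.16) gives 
$$\frac{iqB}{2m}\left(-y\frac{\partial}{\partial x} + x\frac{\partial}{\partial y}\right)\psi = -\frac{qB}{2m}\hat{L}_z\psi \tag{A.17}$$
which generalizes to the standard orbital magnetic moment interaction $-\hat{\boldsymbol{\mu}}\cdot\boldsymbol{B}\psi$ 
where 
$$\hat{\boldsymbol{\mu}} = \frac{q}{2m}\hat{\boldsymbol{L}}. \tag{A.18}$$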




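Time-dependent perturbation theory: 
$$\hat{H} = \hat{H}_0 + \hat{V} \tag{A.19}$$
$$\hat{H}\psi = i\frac{\partial\psi}{\partial t}. \tag{A.20}$$
Unperturbed problem: 
$$\hat{H}_0u_n = E_nu_n. \tag{A.21}$$
Completeness: 
$$\psi(\boldsymbol{x},t) = \sum_n a_n(t)u_n(\boldsymbol{x})e^{-iE_nt}. \tag{A.22}$$
First-order perturbation theory: 
$$a_{fi} = -i\int\!\!\int \mathrm{d}^3\boldsymbol{x}\,\mathrm{d}t\,u_f^*(\boldsymbol{x})e^{+iE_ft}\,\hat{V}(\boldsymbol{x},t)\,u_i(\boldsymbol{x})e^{-iE_it} \tag{A.23}$$
which has the form 
$$a_{fi} = -i\int(\text{volume element})\,(\text{final state})^*\,(\text{perturbing potential})\,(\text{initial state}). \tag{A.24}$$

Important examples: 

(i) $\hat{V}$ independent of $t$: 
$$a_{fi} = -iV_{fi}\,2\pi\delta(E_f - E_i) \tag{A.25}$$
where 
$$V_{fi} = \int\mathrm{d}^3\boldsymbol{x}\,u_f^*(\boldsymbol{x})\hat{V}(\boldsymbol{x})u_i(\boldsymbol{x}). \tag{A.26}$$

(ii) Oscillating time-dependent potential: 

(a) if $\hat{V} \sim e^{-i\omega t}$, time integral of $a_{fi}$ is 
$$\int\mathrm{d}t\,e^{+iE_ft}e^{-i\omega t}e^{-iE_it} = 2\pi\delta(E_f - E_i - \omega) \tag{A.27}$$
i.e. the system has absorbed energy from potential; 

(b) if $\hat{V} \sim e^{+i\omega t}$, time integral of $a_{fi}$ is 
$$\int\mathrm{d}t\,e^{+iE_ft}e^{+i\omega t}e^{-iE_it} = 2\pi\delta(E_f + \omega - E_i) \tag{A.28}$$
i.e. the potential has absorbed energy from system. 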
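
Absorption and emission of photons: For electromagnetic radiation, far 
from its sources, the vector potential satisfies the wave equation 
$$\boldsymbol{\nabla}^2\boldsymbol{A} - \frac{\partial^2\boldsymbol{A}}{\partial t^2} = 0. \tag{A.29}$$
Solution: 
$$\boldsymbol{A}(\boldsymbol{x},t) = \boldsymbol{A}_0\exp(-i\omega t + i\boldsymbol{k}\cdot\boldsymbol{x}) + \boldsymbol{A}_0^*\exp(+i\omega t - i\boldsymbol{k}\cdot\boldsymbol{x}). \tag{A.30}$$
With gauge condition $\boldsymbol{\nabla}\cdot\boldsymbol{A} = 0$ we have 
$$\boldsymbol{k}\cdot\boldsymbol{A}_0 = 0 \tag{A.31}$$
and there are two independent polarization vectors for photons. 
Treat the interaction in first-order perturbation theory: 
$$\hat{V}(\boldsymbol{x},t) = (iq/m)\boldsymbol{A}(\boldsymbol{x},t)\cdot\boldsymbol{\nabla}. \tag{A.32}$$
Thus 
$$\begin{aligned} \boldsymbol{A}_0\exp(-i\omega t + i\boldsymbol{k}\cdot\boldsymbol{x}) &\equiv \text{absorption of photon of energy } \omega \\ \boldsymbol{A}_0^*\exp(+i\omega t - i\boldsymbol{k}\cdot\boldsymbol{x}) &\equiv \text{emission of photon of energy } \omega. \end{aligned} \tag{A.33}$$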


B 


Natural Units 


In particle physics, a widely adopted convention is to work in a system of 
units, called natural units, in which 


$$\hbar = c = 1. \tag{B.1}$$

This avoids having to keep track of untidy factors of $\hbar$ and $c$ throughout a 
calculation; only at the end is it necessary to convert back to more usual units. 
Let us spell out the implications of this choice of $c$ and $\hbar$. 

(i) $c = 1$. In conventional MKS units $c$ has the value 
$$c = 3\times10^8\ \mathrm{m\,s^{-1}}. \tag{B.2}$$
By choosing units such that 
$$c = 1 \tag{B.3}$$
since a velocity has the dimensions 
$$[c] = [L][T]^{-1} \tag{B.4}$$
we are implying that our unit of length is numerically equal to our unit of 
time. In this sense, length and time are equivalent dimensions: 
$$[L] = [T]. \tag{B.5}$$
Similarly, from the energy-momentum relation of special relativity 
$$E^2 = \boldsymbol{p}^2c^2 + m^2c^4 \tag{B.6}$$
we see that the choice of $c = 1$ also implies that energy, mass and momentum 
all have equivalent dimensions. In fact, it is customary to refer to momenta 
in units of ‘MeV/c’ or ‘GeV/c’; these all become ‘MeV’ or ‘GeV’ when $c = 1$. 

(ii) $\hbar = 1$. The numerical value of Planck’s constant is 
$$\hbar = 6.6\times10^{-22}\ \mathrm{MeV\,s} \tag{B.7}$$
and $\hbar$ has dimensions of energy multiplied by time so that 
$$[\hbar] = [M][L]^2[T]^{-1}. \tag{B.8}$$
Setting $\hbar = 1$ therefore relates our units of [M], [L] and [T]. Since [L] and 
[T] are equivalent by our choice of $c = 1$, we can choose [M] as the single 
independent dimension for our natural units: 
$$[M] = [L]^{-1} = [T]^{-1}. \tag{B.9}$$


An example: the pion Compton wavelength. How do we convert from natural 
units to more conventional units? Consider the pion Compton wavelength 
$$\lambda_\pi = \hbar/M_\pi c \tag{B.10}$$
evaluated in both natural and conventional units. In natural units 
$$\lambda_\pi = 1/M_\pi \tag{B.11}$$
where $M_\pi \approx 140$ MeV/$c^2$. In conventional units, using $M_\pi$, $\hbar$ (B.7) and $c$ 
(B.2), we have the familiar result 
$$\lambda_\pi = 1.41\ \mathrm{fm} \tag{B.12}$$
where the ‘fermi’ or femtometre, fm, is defined as 
$$1\ \mathrm{fm} = 10^{-15}\ \mathrm{m}.$$
We therefore have the correspondence 
$$\lambda_\pi = 1/M_\pi = 1.41\ \mathrm{fm}. \tag{B.13}$$


Practical cross section calculations: An easy-to-remember relation may be 
derived from the result 
$$\hbar c \approx 200\ \mathrm{MeV\,fm} \tag{B.14}$$
obtained directly from (B.2) and (B.7). Hence, in natural units, we have the 
relation 
$$1\ \mathrm{fm} \approx \frac{1}{200\ \mathrm{MeV}} = 5\ (\mathrm{GeV})^{-1}. \tag{B.15}$$
Cross sections are calculated without $\hbar$’s and $c$’s and all masses, energies and 
momenta typically in MeV or GeV. To convert the result to an area, we merely 
remember the dimensions of a cross section: 
$$[\sigma] = [L]^2 = [M]^{-2}. \tag{B.16}$$
If masses, momenta and energies have been specified in GeV, from (B.15) we 
derive the useful result (from the more precise relation $\hbar c = 197.328$ MeV fm) 
$$\left(\frac{1}{1\ \mathrm{GeV}}\right)^2 = 1\ (\mathrm{GeV})^{-2} \approx 0.389\ \mathrm{mb} \tag{B.17}$$
where a millibarn, mb, is defined to be 
$$1\ \mathrm{mb} = 10^{-31}\ \mathrm{m}^2.$$
Note that a ‘typical’ hadronic cross section corresponds to an area of about 
$\lambda_\pi^2$ where 
$$\lambda_\pi^2 = 1/M_\pi^2 = 20\ \mathrm{mb}.$$
Electromagnetic cross sections are an order of magnitude smaller: specifically 
for lowest order $e^+e^- \to \mu^+\mu^-$ 
$$\sigma \approx \frac{86.8}{s}\ \mathrm{nb} \tag{B.18}$$
where $s$ is in (GeV)$^2$ (see problem 8.18(d) in chapter 8). 
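
The conversions in this appendix are easy to automate. The following sketch reproduces the pion Compton wavelength, the GeV$^{-2}$ to millibarn factor and the $e^+e^- \to \mu^+\mu^-$ estimate. It is an illustration only: the function names are hypothetical, the value of $\alpha$ is an assumed input, and the lowest-order formula $\sigma = 4\pi\alpha^2/3s$ is the standard result, assumed here rather than derived.

```python
from math import pi

HBARC_MEV_FM = 197.328   # MeV fm, the more precise value quoted in the text
FM2_PER_MB = 0.1         # 1 mb = 1e-31 m^2 = 0.1 fm^2
ALPHA = 1.0 / 137.036    # assumed value of the fine structure constant


def compton_wavelength_fm(mass_mev):
    """lambda = hbar c / (M c^2), returned in fm for a mass given in MeV."""
    return HBARC_MEV_FM / mass_mev


def gev_minus2_to_mb(value_gev_minus2):
    """Convert an area in natural units, (GeV)^-2, to millibarns."""
    hbarc_gev_fm = HBARC_MEV_FM / 1000.0
    area_fm2 = value_gev_minus2 * hbarc_gev_fm ** 2
    return area_fm2 / FM2_PER_MB


def sigma_mumu_nb(s_gev2, alpha=ALPHA):
    """Lowest-order e+ e- -> mu+ mu- cross section, 4 pi alpha^2 / (3 s), in nb."""
    sigma_natural = 4.0 * pi * alpha ** 2 / (3.0 * s_gev2)   # (GeV)^-2
    return gev_minus2_to_mb(sigma_natural) * 1.0e6           # 1 mb = 1e6 nb


if __name__ == "__main__":
    print(compton_wavelength_fm(140.0))        # pion Compton wavelength, fm
    print(gev_minus2_to_mb(1.0))               # 1 (GeV)^-2 in mb
    print(gev_minus2_to_mb(1.0 / 0.140 ** 2))  # 1 / M_pi^2 in mb
    print(sigma_mumu_nb(1.0))                  # sigma at s = 1 GeV^2, nb
```

Keeping $\hbar c$ as the single conversion constant mirrors the advice in the text: compute in natural units throughout and restore conventional units only at the end.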




C 


Maxwell’s Equations: Choice of Units 


In high-energy physics, it is not the convention to use the rationalized MKS 
system of units when treating Maxwell’s equations. Since the discussion is 
always limited to field equations in vacuo, it is usually felt desirable to adopt 
a system of units in which these equations take their simplest possible form, 
in particular one such that the constants $\epsilon_0$ and $\mu_0$, employed in the MKS 
system, do not appear. These two constants enter, of course, via the force 
laws of Coulomb and Ampère, respectively. These laws relate a mechanical 
quantity (force) to electrical ones (charge and current). The introduction of 
$\epsilon_0$ in Coulomb’s law 

$$\boldsymbol{F} = \frac{q_1q_2\boldsymbol{r}}{4\pi\epsilon_0r^3} \tag{C.1}$$

enables one to choose arbitrarily one of the electrical units and assign to it 
a dimension independent of those entering into mechanics (mass, length and 
time). If, for example, we use the coulomb as the basic electrical quantity 
(as in the MKS system), $\epsilon_0$ has dimension (coulomb)$^2$ [T]$^2$/[M][L]$^3$. Thus 
the common practical units (volt, ampère, coulomb, etc) can be employed 
in applications to both fields and circuits. However, for our purposes this 
advantage is irrelevant, since we are only concerned with the field equations, 
not with practical circuits. In our case, we prefer to define the electrical units 
in terms of mechanical ones in such a way as to reduce the field equations to 
their simplest form. The field equation corresponding to (C.1) is 

$$\boldsymbol{\nabla}\cdot\boldsymbol{E} = \rho/\epsilon_0 \quad \text{(Gauss’ law: MKS)} \tag{C.2}$$

and this may obviously be simplified if we choose the unit of charge such that $\epsilon_0$ 
becomes unity. Such a system, in which CGS units are used for the mechanical 
quantities, is a variant of the electrostatic part of the ‘Gaussian CGS’ system. 
The original Gaussian system set $\epsilon_0 \to 1/4\pi$, thereby simplifying the force 
law (C.1), but introducing a compensating $4\pi$ into the field equation (C.2). 
The field equation is, in fact, primary, and the $4\pi$ is a geometrical factor 
appropriate only to the specific case of three dimensions, so that it should 
not appear in a field equation of general validity. The system in which $\epsilon_0$ in 
(C.2) may be replaced by unity is called the ‘rationalized Gaussian CGS’ or 
‘Heaviside-Lorentz’ system: 

$$\boldsymbol{\nabla}\cdot\boldsymbol{E} = \rho \quad \text{(Gauss’ law; Heaviside-Lorentz)}. \tag{C.3}$$
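Generally, systems in which the $4\pi$ factors appear in the force equations rather 
than the field equations are called ‘rationalized’. 

Of course, (C.3) is only the first of the Maxwell equations in Heaviside-Lorentz 
units. In the Gaussian system, $\mu_0$ in Ampère’s force law 

$$\boldsymbol{F} = \frac{\mu_0}{4\pi}\int\!\!\int\frac{\boldsymbol{j}_1\times(\boldsymbol{j}_2\times\boldsymbol{r}_{12})}{r_{12}^3}\,\mathrm{d}^3\boldsymbol{r}_1\,\mathrm{d}^3\boldsymbol{r}_2 \tag{C.4}$$

was set equal to $4\pi$, thereby defining a unit of current (the electromagnetic 
unit or Biot (Bi emu)). The unit of charge (the electrostatic unit or Franklin 
(Fr esu)) has already been defined by the (Gaussian) choice $\epsilon_0 = 1/4\pi$ and 
currents via $\mu_0 \to 4\pi$, and $c$ appears explicitly in the equations. In the 
rationalized (Heaviside-Lorentz) form of this system, $\epsilon_0 \to 1$ and $\mu_0 \to 1$, and 
the remaining Maxwell equations are 

$$\boldsymbol{\nabla}\times\boldsymbol{E} = -\frac{1}{c}\frac{\partial\boldsymbol{B}}{\partial t} \tag{C.5}$$
$$\boldsymbol{\nabla}\cdot\boldsymbol{B} = 0 \tag{C.6}$$
$$\boldsymbol{\nabla}\times\boldsymbol{B} = \boldsymbol{j} + \frac{1}{c}\frac{\partial\boldsymbol{E}}{\partial t}. \tag{C.7}$$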

A further discussion of units in electromagnetic theory is given in Panofsky 
and Phillips (1962, appendix I). 

Finally, throughout this book we have used a particular choice of units for 
mass, length and time such that $\hbar = c = 1$ (see appendix B). In that case, the 
Maxwell equations we use are as in (C.3), (C.5)-(C.7), but with $c$ replaced by 
unity. 

As an example of the relation between MKS and the system employed in 
this book (and universally in high-energy physics), we remark that the fine 
structure constant is written as 

$$\alpha = \frac{e^2}{4\pi\epsilon_0\hbar c} \quad \text{in MKS units} \tag{C.8}$$
or as 
$$\alpha = \frac{e^2}{4\pi} \quad \text{in Heaviside-Lorentz units with } \hbar = c = 1. \tag{C.9}$$

Clearly the value of $\alpha$ ($\approx 1/137$) is the same in both cases, but the numerical 
values of ‘$e$’ in (C.8) and in (C.9) are, of course, different. 

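The choice of rationalized MKS units for Maxwell’s equations is a part of 
the SI system of units. In this system of units the numerical values of $\mu_0$ and 
$\epsilon_0$ are 

$$\mu_0 = 4\pi\times10^{-7}\ (\mathrm{kg\,m\,C^{-2}} = \mathrm{H\,m^{-1}})$$
and, since $\mu_0\epsilon_0 = 1/c^2$, 
$$\epsilon_0 = \frac{10^7}{4\pi c^2} = \frac{1}{36\pi\times10^9}\ (\mathrm{C^2\,s^2\,kg^{-1}\,m^{-3}} = \mathrm{F\,m^{-1}}).$$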
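
As a numerical cross-check of (C.8) and (C.9), the sketch below evaluates $\alpha$ from SI constants and then extracts the dimensionless Heaviside-Lorentz charge $e = (4\pi\alpha)^{1/2}$. The numerical values of $e$, $\hbar$ and $c$ are assumed inputs not quoted in the text, and the function names are hypothetical.

```python
from math import pi, sqrt

E_CHARGE = 1.602176634e-19   # C, assumed SI value of the elementary charge
HBAR = 1.054571817e-34       # J s, assumed SI value
C_LIGHT = 299792458.0        # m/s
MU_0 = 4.0 * pi * 1.0e-7     # H/m, the value given in the text
EPSILON_0 = 1.0 / (MU_0 * C_LIGHT ** 2)   # F/m, from mu_0 * epsilon_0 = 1/c^2


def alpha_mks():
    """Fine structure constant from (C.8): e^2 / (4 pi epsilon_0 hbar c)."""
    return E_CHARGE ** 2 / (4.0 * pi * EPSILON_0 * HBAR * C_LIGHT)


def e_heaviside_lorentz():
    """Dimensionless charge from (C.9) with hbar = c = 1: e = sqrt(4 pi alpha)."""
    return sqrt(4.0 * pi * alpha_mks())


if __name__ == "__main__":
    print(1.0 / alpha_mks())
    print(e_heaviside_lorentz())
```

The value of $\alpha$ is unit independent, whereas the number called ‘$e$’ changes from about $1.6\times10^{-19}$ (in coulombs) to about 0.3 (dimensionless), which is the point made in the text.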


D 


Special Relativity: Invariance and Covariance 


The co-ordinate 4-vector $x^\mu$ is defined by 
$$x^\mu = (x^0, x^1, x^2, x^3)$$
where $x^0 = t$ (with $c = 1$) and $(x^1, x^2, x^3) = \boldsymbol{x}$. Under a Lorentz transformation 
along the $x^1$-axis with velocity $v$, $x^\mu$ transforms to 

$$\begin{aligned} x'^0 &= \gamma(x^0 - vx^1) \\ x'^1 &= \gamma(-vx^0 + x^1) \\ x'^2 &= x^2 \\ x'^3 &= x^3 \end{aligned} \tag{D.1}$$

where $\gamma = (1 - v^2)^{-1/2}$. 

A general ‘contravariant 4-vector’ is defined to be any set of four quantities 
$A^\mu = (A^0, A^1, A^2, A^3) = (A^0, \boldsymbol{A})$ which transform under Lorentz transformations 
exactly as the corresponding components of the coordinate 4-vector $x^\mu$. 
Note that the definition is phrased in terms of the transformation property 
(under Lorentz transformations) of the object being defined. An important 
example is the energy-momentum 4-vector $p^\mu = (E, \boldsymbol{p})$, where for a particle 
of rest mass $m$, $E = (\boldsymbol{p}^2 + m^2)^{1/2}$. Another example is the 4-gradient 
$\partial^\mu = (\partial^0, -\boldsymbol{\nabla})$ (see problem 2.1) where 

$$\partial^0 = \frac{\partial}{\partial t}, \qquad \boldsymbol{\nabla} = \left(\frac{\partial}{\partial x^1}, \frac{\partial}{\partial x^2}, \frac{\partial}{\partial x^3}\right). \tag{D.2}$$

Lorentz transformations leave the expression $(A^0)^2 - \boldsymbol{A}^2$ invariant for a general 
4-vector $A^\mu$. For example, $E^2 - \boldsymbol{p}^2 = m^2$ is invariant, implying that the rest 
mass $m$ is invariant under Lorentz transformations. Another example is the 
four-dimensional invariant differential operator analogous to $\boldsymbol{\nabla}^2$, namely 
$$\Box = (\partial^0)^2 - \boldsymbol{\nabla}^2$$
which is precisely the operator appearing in the massless wave equation 
$$\Box\phi = (\partial^0)^2\phi - \boldsymbol{\nabla}^2\phi = 0.$$

The expression $(A^0)^2 - \boldsymbol{A}^2$ may be regarded as the scalar product of $A^\mu$ with 
a related ‘covariant vector’ $A_\mu = (A^0, -\boldsymbol{A})$. Then 
$$(A^0)^2 - \boldsymbol{A}^2 = \sum_\mu A^\mu A_\mu$$
where, in practice, the summation sign on repeated ‘upstairs’ and ‘downstairs’ 
indices is always omitted. We shall often shorten the expression ‘$A^\mu A_\mu$’ even 
further, to ‘$A^2$’; thus $p^2 = E^2 - \boldsymbol{p}^2 = m^2$. The ‘downstairs’ version of $\partial^\mu$ is 
$\partial_\mu = (\partial^0, \boldsymbol{\nabla})$. Then $\partial_\mu\partial^\mu = \partial^2 = \Box$. ‘Lowering’ and ‘raising’ indices is effected 
by the metric tensor $g^{\mu\nu}$ or $g_{\mu\nu}$, where $g^{00} = g_{00} = 1$, $g^{11} = g^{22} = g^{33} = 
g_{11} = g_{22} = g_{33} = -1$, all other components vanishing. Thus if $A_\mu = g_{\mu\nu}A^\nu$ 
then $A_0 = A^0$, $A_1 = -A^1$, etc. 

In the same way, the scalar product $A\cdot B$ of two 4-vectors is 

$$A\cdot B = A^\mu B_\mu = A^0B^0 - \boldsymbol{A}\cdot\boldsymbol{B} \tag{D.3}$$

and this is also invariant under Lorentz transformations. For example, the 
invariant four-dimensional divergence of a 4-vector $j^\mu = (\rho, \boldsymbol{j})$ is 

$$\partial^\mu j_\mu = \partial^0\rho - (-\boldsymbol{\nabla})\cdot\boldsymbol{j} = \partial^0\rho + \boldsymbol{\nabla}\cdot\boldsymbol{j} = \partial_\mu j^\mu \tag{D.4}$$

since the spatial part of $\partial^\mu$ is $-\boldsymbol{\nabla}$. 

Because the Lorentz transformation is linear, it immediately follows that 
the sum (or difference) of two 4-vectors is also a 4-vector. In a reaction of the 
type ‘$1 + 2 \to 3 + 4 + \cdots N$’ we express the conservation of both energy and 
momentum as one ‘4-momentum conservation equation’: 

$$p_1^\mu + p_2^\mu = p_3^\mu + p_4^\mu + \cdots + p_N^\mu. \tag{D.5}$$

In practice, the 4-vector index on all the $p$’s is conventionally omitted in 
conservation equations such as (D.5), but it is nevertheless important to 
remember, in that case, that it is actually four equations, one for the energy 
components and a further three for the momentum components. Further, it 
follows that quantities such as $(p_1 + p_2)^2$, $(p_1 - p_3)^2$ are invariant under Lorentz 
transformations. 
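
The invariance statements of this appendix can be tested numerically with the boost (D.1) and the scalar product (D.3). The sketch below is illustrative only; the function names and the sample momenta are arbitrary assumptions.

```python
from math import sqrt


def boost_x(vec, v):
    """Apply the Lorentz transformation (D.1) along the x^1 axis (c = 1)."""
    gamma = 1.0 / sqrt(1.0 - v * v)
    a0, a1, a2, a3 = vec
    return (gamma * (a0 - v * a1), gamma * (-v * a0 + a1), a2, a3)


def dot(a, b):
    """Scalar product (D.3): A.B = A^0 B^0 - (spatial dot product)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]


def four_momentum(mass, px, py, pz):
    """On-shell 4-momentum (E, p) with E = sqrt(p^2 + m^2)."""
    energy = sqrt(px * px + py * py + pz * pz + mass * mass)
    return (energy, px, py, pz)


def add(a, b):
    """Componentwise sum of two 4-vectors."""
    return tuple(x + y for x, y in zip(a, b))


if __name__ == "__main__":
    p1 = four_momentum(0.938, 0.3, -0.2, 1.5)   # arbitrary sample values, GeV
    p2 = four_momentum(0.140, -0.7, 0.4, 0.1)
    total = add(p1, p2)
    for v in (0.0, 0.6, -0.9):
        p1b, p2b = boost_x(p1, v), boost_x(p2, v)
        print(v, dot(p1b, p1b), dot(add(p1b, p2b), add(p1b, p2b)), dot(total, total))
```

For every boost velocity the printed $p_1^2$ stays at $m^2$ and $(p_1 + p_2)^2$ is unchanged, even though the individual components differ from frame to frame.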

We may also consider products of the form $A^\mu B^\nu$, where $A$ and $B$ are 
4-vectors. As $\mu$ and $\nu$ each run over their four possible values (0, 1, 2, 3) 
16 different ‘components’ are generated ($A^0B^0, A^0B^1, \ldots, A^3B^3$). Under a 
Lorentz transformation, the components of $A$ and $B$ will transform into 
definite linear combinations of themselves, as in the particular case of (D.1). It 
follows that the 16 components of $A^\mu B^\nu$ will also transform into well-defined 
linear combinations of themselves (try it for $A^0B^1$ and (D.1)). Thus we have 
constructed a new object whose 16 components transform by a well-defined 
linear transformation law under a Lorentz transformation, as did the 
components of a 4-vector. This new quantity, defined by its transformation law, is 
called a tensor, or more precisely a ‘contravariant second-rank tensor’, the 
‘contravariant’ referring to the fact that both indices are upstairs, the ‘second 
rank’ meaning that it has two indices. An important example of such a tensor 
is provided by $\partial^\mu A^\nu(x) - \partial^\nu A^\mu(x)$, which is the electromagnetic field strength 
tensor $F^{\mu\nu}$, introduced in chapter 2. More generally we can consider 
tensors $B^{\mu\nu}$ which are not literally formed by ‘multiplying’ two vectors together, 
but which transform in just the same way; and we can introduce third- and 
higher-rank tensors similarly, which can also be ‘mixed’, with some upstairs 
and some downstairs indices. 

We now state a very useful and important fact. Suppose we ‘dot’ a downstairs 
4-vector $A_\mu$ into a contravariant second-rank tensor $B^{\mu\nu}$, via the 
operation $A_\mu B^{\mu\nu}$, where as always a sum on the repeated index $\mu$ is understood. 
Then this quantity transforms as a 4-vector, via its ‘loose’ index $\nu$. This is 
obvious if $B^{\mu\nu}$ is actually a product such as $B^{\mu\nu} = C^\mu D^\nu$, since then we have 
$A_\mu B^{\mu\nu} = (A\cdot C)D^\nu$, and $(A\cdot C)$ is an invariant, which leaves the 4-vector $D^\nu$ 
as the only ‘transforming’ object left. But even if $B^{\mu\nu}$ is not such a product, 
it transforms under Lorentz transformations in exactly the same way as if it 
were, and this leads to the same result. An example is provided by the 
quantity $\partial_\mu F^{\mu\nu}$ which enters on the left-hand side of the Maxwell equations in the 
form (2.18). 

This example brings us conveniently to the remaining concept we need to 
introduce here, which is the important one of ‘covariance’. Referring to (2.18), 
we note that it has the form of an equality between two quantities ($\partial_\mu F^{\mu\nu}$ on 
the left, $j^\nu_{\mathrm{em}}$ on the right) each of which transforms in the same way under 
Lorentz transformations — namely as a contravariant 4-vector. One says that 
(2.18) is ‘Lorentz covariant’, the word ‘covariant’ here meaning precisely that 
both sides transform in the same way (i.e. consistently) under Lorentz trans- 
formations. Confusingly enough, this use of the word ‘covariant’ is evidently 
quite different from the one encountered previously in an expression such as 
‘a covariant 4-vector’, where it just meant a 4-vector with a downstairs index. 
This new meaning of ‘covariant’ is actually much better captured by an alter- 
native name for the same thing, which is ‘form invariant’, as we will shortly 
see. 

Why is this idea so important? Consider the (special) relativity principle, 
which states that the laws of physics should be the same in all inertial frames. 
The way in which this physical requirement is implemented mathematically 
is precisely via the notion of covariance under Lorentz transformations. For, 
consider how a law will typically be expressed. Relative to one inertial frame, 
we set up a coordinate system and describe the phenomena in question in 
terms of suitable coordinates, and such other quantities (forces, fields, etc) as 
may be necessary. We write the relevant law mathematically as equations re- 
lating these quantities, all referred to our chosen frame and coordinate system. 
What the relativity principle requires is that these relationships — these equa- 
tions — must have the same form when the quantities in them are referred to 
a different inertial frame. Note that we must say ‘have the same form’, rather 
than ‘be identical to’, since we know very well that coordinates, at least, are 
not identical in two different inertial frames (cf (D.1)). This is why the term 
‘form invariant’ is a more helpful one than ‘covariant’ in this context, but the 
latter is more commonly used. 

A more elementary example may be helpful. Consider Newton’s law in the 
simple form $\boldsymbol{F} = m\ddot{\boldsymbol{r}}$. This equation is ‘covariant under rotations’, meaning 
that it preserves the same form under a rotation of the coordinate system, 
and this in turn means that the physics it expresses is independent of the 
orientation of our coordinate axes. The ‘same form’ in this case is of course 
just $\boldsymbol{F}' = m\ddot{\boldsymbol{r}}'$. We emphasize again that the components of $\boldsymbol{F}'$ are not the 
same as those of $\boldsymbol{F}$, nor are the components of $\ddot{\boldsymbol{r}}'$ the same as those of $\ddot{\boldsymbol{r}}$; 
but the relationship between $\boldsymbol{F}'$ and $\ddot{\boldsymbol{r}}'$ is exactly the same as the relationship 
between $\boldsymbol{F}$ and $\ddot{\boldsymbol{r}}$, and that is what is required. 

It is important to understand why this deceptively simple result ($\boldsymbol{F}' = 
m\ddot{\boldsymbol{r}}'$) has been obtained. The reason is that we have assumed (or asserted) 
that ‘force’ is in fact to be represented mathematically as a 3-vector quantity. 
Once we have said that, the rest follows. More formally, the transformation 
law of the components of $\boldsymbol{r}$ is $r'_i = R_{ij}r_j$ (sum on $j$ understood), where the 
matrix of transformation coefficients $R$ is ‘orthogonal’ ($RR^{\mathrm{T}} = R^{\mathrm{T}}R = I$), 
which ensures that the length (squared) of $\boldsymbol{r}$ is invariant, $\boldsymbol{r}^2 = \boldsymbol{r}'^2$. To say 
that ‘force is a 3-vector’ then implies that the components of $\boldsymbol{F}$ transform 
by the same set of coefficients $R_{ij}$: $F'_i = R_{ij}F_j$. Thus starting from the 
law $F_j = m\ddot{r}_j$ which relates the components in one frame, by multiplying 
both sides of the equation by $R_{ij}$ and summing over $j$ we arrive at $F'_i = m\ddot{r}'_i$, 
which states precisely that the components in the primed frame bear the same 
relationship to each other as the components in the unprimed frame did. This 
is the property of covariance under rotations, and it ensures that the physics 
embodied in the law is the same for all systems which differ from one another 
only by a rotation. 

In just the same way, if we can write equations of physics as equalities 
between quantities which transform in the same way (i.e. ‘are covariant’) under 
Lorentz transformations, we will guarantee that these laws obey the relativity 
principle. This is indeed the case in the Lorentz covariant formulation of 
Maxwell’s equations, given in (2.18), which we now repeat here: $\partial_\mu F^{\mu\nu} = j^\nu_{\mathrm{em}}$. 
To check covariance, we follow essentially the same steps as in the case of 
Newton’s equations, except that the transformations being considered are 
Lorentz transformations. Inserting the expression (2.19) for $F^{\mu\nu}$, the equation 
can be written as $(\partial_\mu\partial^\mu)A^\nu - \partial^\nu(\partial_\mu A^\mu) = j^\nu_{\mathrm{em}}$. The two quantities enclosed 
in parentheses are actually invariants, as was mentioned earlier. This means 
that $\partial_\mu\partial^\mu$ is equal to $\partial'_\mu\partial'^\mu$, and similarly $\partial_\mu A^\mu = \partial'_\mu A'^\mu$, so that we can 
write the equation as $(\partial'_\mu\partial'^\mu)A^\nu - \partial^\nu(\partial'_\mu A'^\mu) = j^\nu_{\mathrm{em}}$. It is now clear that if 
we apply a Lorentz transformation to both sides, $A^\nu$ and $\partial^\nu$ will become $A'^\nu$ 
and $\partial'^\nu$ respectively, while $j^\nu_{\mathrm{em}}$ will become $j'^\nu_{\mathrm{em}}$, since all these quantities 
are 4-vectors, transforming the same way (as the 3-vectors did in the Newton 
case). Thus we obtain just the same form of equation, written in terms of the 
‘primed frame’ quantities, and this is the essence of (Lorentz transformation) 
covariance. 

Actually, the detailed ‘check’ that we have just performed is really 
unnecessary. All that is required for covariance is that (once again!) both sides of 
equations transform the same way. That this is true of (2.18) can be seen ‘by 
inspection’, once we understand the significance (for instance) of the fact that 
the $\mu$ indices are ‘dotted’ so as to form an invariant. This example should 
convince the reader of the power of the 4-vector notation for this purpose: 
compare the ‘by inspection’ covariance of (2.18) with the job of verifying 
Lorentz covariance starting from the original Maxwell equations (2.1), (2.2), 
(2.3) and (2.8)! The latter involves establishing the rather complicated 
transformation law for the fields $\boldsymbol{E}$ and $\boldsymbol{B}$ (which, of course, form parts of the 
tensor $F^{\mu\nu}$). One can indeed show in this way that the Maxwell equations 
are covariant under Lorentz transformations, but they are not manifestly (i.e. 
without doing any work) so, whereas in the form (2.18) they are. 


E 


Dirac δ-Function 


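Consider approximating an integral by a sum over strips $\Delta x$ wide as shown 
in figure E.1: 

$$\int_{x_1}^{x_2} f(x)\,\mathrm{d}x \approx \sum_i f(x_i)\,\Delta x. \tag{E.1}$$
Consider the function $\delta(x - x_j)$ shown in figure E.2, 
$$\delta(x - x_j) = \begin{cases} 1/\Delta x & \text{in the $j$th interval} \\ 0 & \text{all others.} \end{cases} \tag{E.2}$$

Clearly this function has the properties 
$$\sum_i f(x_i)\delta(x_i - x_j)\Delta x = f(x_j) \tag{E.3}$$
and 
$$\sum_i \delta(x_i - x_j)\Delta x = 1. \tag{E.4}$$

In the limit as we pass to an integral form, we might expect (applying (E.1) 
to the left-hand sides) that these equations reduce to 
$$\int_{x_1}^{x_2} f(x)\delta(x - x_j)\,\mathrm{d}x = f(x_j) \tag{E.5}$$
and 
$$\int_{x_1}^{x_2}\delta(x - x_j)\,\mathrm{d}x = 1 \tag{E.6}$$
provided that $x_1 < x_j < x_2$. Clearly such ‘δ-functions’ can easily be generalized 
to more dimensions, e.g. three dimensions: 
$$\mathrm{d}V = \mathrm{d}x\,\mathrm{d}y\,\mathrm{d}z \equiv \mathrm{d}^3\boldsymbol{r}, \qquad \delta(\boldsymbol{r} - \boldsymbol{r}_j) \equiv \delta(x - x_j)\delta(y - y_j)\delta(z - z_j). \tag{E.7}$$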


Informally, therefore, we can think of the δ-function as a function that is zero 
everywhere except where its argument vanishes, at which point it is infinite 
in such a way that its integral has unit area, and equations (E.5) and (E.6) 
hold. Do such amazing functions exist? In fact, the informal idea just given 
does not define a respectable mathematical function. More properly the use 
of the ‘δ-function’ can be justified by introducing the notion of ‘distributions’ 
or ‘generalized functions’. Roughly speaking, this means we can think of the 
‘δ-function’ as the limit of a sequence of functions, whose properties converge 
to those given here. The following useful expressions all approximate the 
δ-function in this sense: 

$$\delta(x) = \lim_{\epsilon\to0}\begin{cases} 1/\epsilon & \text{for } -\epsilon/2 \leq x \leq \epsilon/2 \\ 0 & \text{for } |x| > \epsilon/2 \end{cases} \tag{E.8}$$
$$\delta(x) = \lim_{\epsilon\to0}\frac{1}{\pi}\frac{\epsilon}{x^2 + \epsilon^2} \tag{E.9}$$
$$\delta(x) = \lim_{N\to\infty}\frac{1}{\pi}\frac{\sin(Nx)}{x}. \tag{E.10}$$

The first of these is essentially the same as (E.2), and the second is a ‘smoother’ 
version of the first. The third is sketched in figure E.3: as $N$ tends to infinity, 
the peak becomes infinitely high and narrow, but it still preserves unit 
area. 

FIGURE E.1 
Approximate evaluation of integral. 

FIGURE E.2 
The function $\delta(x - x_j)$. 

FIGURE E.3 
The function (E.10) for finite $N$. 


Usually, under integral signs, δ-functions can be manipulated with no danger 
of obtaining a mathematically incorrect result. However, care must be 
taken when products of two such generalized functions are encountered. 
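
The sense in which a sequence of ordinary functions ‘approximates’ the δ-function can be made concrete by integrating one of them against a smooth test function. The sketch below uses the Lorentzian form $\epsilon/\pi(x^2 + \epsilon^2)$ and a simple trapezoidal rule. The test function, integration range and step count are arbitrary assumptions, and the function names are hypothetical.

```python
from math import exp, pi


def lorentzian(x, eps):
    """Nascent delta function eps / (pi (x^2 + eps^2)), of unit area."""
    return eps / (pi * (x * x + eps * eps))


def smeared_value(f, eps, half_width=10.0, steps=40000):
    """Trapezoidal estimate of the integral of f(x) * lorentzian(x, eps)."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        x = -half_width + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * f(x) * lorentzian(x, eps)
    return total * h


def gaussian(x):
    """Smooth test function with gaussian(0) = 1."""
    return exp(-x * x)


if __name__ == "__main__":
    for eps in (0.1, 0.03, 0.01):
        print(eps, smeared_value(gaussian, eps))
```

As $\epsilon$ decreases the integral approaches $f(0) = 1$, which is the defining property (E.5), although for any finite $\epsilon$ there is a residual error of order $\epsilon$.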

Resumé of Fourier series and Fourier transforms 

Fourier’s theorem asserts that any suitably well-behaved periodic function 
with period $L$ can be expanded as follows: 
$$f(x) = \sum_{n=-\infty}^{\infty} a_ne^{2\pi inx/L}. \tag{E.11}$$
Using the orthonormality relation 
$$\frac{1}{L}\int_{-L/2}^{L/2} e^{-2\pi imx/L}e^{2\pi inx/L}\,\mathrm{d}x = \delta_{mn} \tag{E.12}$$
with the Kronecker δ-symbol defined by 
$$\delta_{mn} = \begin{cases} 1 & \text{if } m = n \\ 0 & \text{if } m \neq n \end{cases} \tag{E.13}$$
the coefficients in the expansion may be determined: 
$$a_m = \frac{1}{L}\int_{-L/2}^{L/2} f(x)e^{-2\pi imx/L}\,\mathrm{d}x. \tag{E.14}$$

Consider the limit of these expressions as $L \to \infty$. We may write 
$$f(x) = \sum_{n=-\infty}^{\infty} F_n\Delta n \tag{E.15}$$
with 
$$F_n = a_ne^{2\pi inx/L} \tag{E.16}$$
and the interval $\Delta n = 1$. Defining 
$$2\pi n/L = k \tag{E.17}$$
and 
$$La_n = g(k) \tag{E.18}$$
we can take the limit $L \to \infty$ to obtain 
$$f(x) = \int_{-\infty}^{\infty} F_n\,\mathrm{d}n = \int_{-\infty}^{\infty}\frac{g(k)e^{ikx}}{L}\,\frac{L\,\mathrm{d}k}{2\pi}. \tag{E.19}$$
Thus 
$$f(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} g(k)e^{ikx}\,\mathrm{d}k \tag{E.20}$$
and similarly from (E.14) 
$$g(k) = \int_{-\infty}^{\infty} f(x)e^{-ikx}\,\mathrm{d}x. \tag{E.21}$$

These are the Fourier transform relations, and they lead us to an important 
representation of the Dirac δ-function. 
Substitute $g(k)$ from (E.21) into (E.20) to obtain 
$$f(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\mathrm{d}k\,e^{ikx}\int_{-\infty}^{\infty}\mathrm{d}x'\,e^{-ikx'}f(x'). \tag{E.22}$$
Reordering the integrals, we arrive at the result 
$$f(x) = \int_{-\infty}^{\infty}\mathrm{d}x'\,f(x')\left(\frac{1}{2\pi}\int_{-\infty}^{\infty} e^{ik(x-x')}\,\mathrm{d}k\right) \tag{E.23}$$
valid for any function $f(x)$. Thus the expression 
$$\frac{1}{2\pi}\int_{-\infty}^{\infty} e^{ik(x-x')}\,\mathrm{d}k \tag{E.24}$$
has the remarkable property of vanishing everywhere except at $x = x'$, and 
its integral with respect to $x$ over any interval including $x'$ is unity (set $f = 1$ 
in (E.23)). In other words, (E.24) provides us with a new representation of 
the Dirac δ-function: 
$$\delta(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{ikx}\,\mathrm{d}k. \tag{E.25}$$

Equation (E.25) is very important. It is the representation of the δ-function 
which is most commonly used, and it occurs throughout this book. 
Note that if we replace the upper and lower limits of integration in (E.25) by 
$N$ and $-N$, and consider the limit $N \to \infty$, we obtain exactly (E.10). 

The integral in (E.25) represents the superposition, with identical uniform 
weight $(2\pi)^{-1}$, of plane waves of all wavenumbers. Physically it may 
be thought of (cf (E.20)) as the Fourier transform of unity. Equation (E.25) 
asserts that the contributions from all these waves cancel completely, unless 
the phase parameter $x$ is zero, in which case the integral manifestly diverges 
and ‘$\delta(0)$ is infinity’ as expected. The fact that the Fourier transform of a 
constant is a δ-function is an extreme case of the bandwidth theorem from 
Fourier transform theory, which states that if the (suitably defined) ‘spread’ in 
a function $g(k)$ is $\Delta k$, and that of its transform $f(x)$ is $\Delta x$, then $\Delta x\Delta k \geq \frac{1}{2}$. 
In the present case $\Delta k$ is tending to infinity and $\Delta x$ to zero. 

One very common use of (E.25) refers to the normalization of plane-wave 
states. If we rewrite it in the form 
$$\delta(k' - k) = \int_{-\infty}^{\infty}\frac{e^{-ik'x}}{(2\pi)^{1/2}}\,\frac{e^{ikx}}{(2\pi)^{1/2}}\,\mathrm{d}x \tag{E.26}$$
we can interpret it to mean that the wavefunctions $e^{ikx}/(2\pi)^{1/2}$ and $e^{ik'x}/(2\pi)^{1/2}$ 
are orthogonal on the real axis $-\infty < x < \infty$ for $k \neq k'$ (since the left-hand 
side is zero), while for $k = k'$ their overlap is infinite, in such a way that the 
integral of this overlap is unity. This is the continuum analogue of 
orthonormality for wavefunctions labelled by a discrete index, as in (E.12). We say that 
the plane waves in (E.26) are ‘normalized to a δ-function’. There is, however, 
a problem with this: plane waves are not square integrable and thus do not 
strictly belong to a Hilbert space. Mathematical physicists concerned with 
such matters have managed to deal with this by introducing ‘rigged’ Hilbert 
spaces in which such a normalization is legitimate. Although we often, in the 
text, appear to be using ‘box normalization’ (i.e. restricting space to a finite 
volume $V$), in practice when we evaluate integrals over plane waves the limits 
will be extended to infinity, and results like (E.26) will be used repeatedly. 
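Important three- and four-dimensional generalizations of (E.25) are:

$$\int e^{i\mathbf{k}\cdot\mathbf{x}}\,d^3\mathbf{k} = (2\pi)^3\,\delta(\mathbf{x}) \tag{E.27}$$

and

$$\int e^{ik\cdot x}\,d^4k = (2\pi)^4\,\delta(x) \tag{E.28}$$

where $k\cdot x = k^0x^0 - \mathbf{k}\cdot\mathbf{x}$ (see appendix D) and
$\delta(x) = \delta(x^0)\,\delta(\mathbf{x})$.

Properties of the δ-function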


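The basic properties of the δ-function are exemplified by the equations (see
(E.5) and (E.6))

$$\int_{-\infty}^{\infty}\delta(x-a)\,dx = 1, \qquad \delta(x-a) = 0 \ \text{ for } x \neq a, \tag{E.29}$$

where a is any real number; and

$$\int_{-\infty}^{\infty} f(x)\,\delta(x-a)\,dx = f(a), \tag{E.30}$$

where f(x) is any continuous function of x. Other useful properties follow:

(i) $$\delta(ax) = \frac{1}{|a|}\,\delta(x). \tag{E.31}$$

Proof

For a > 0,

$$\int_{-\infty}^{\infty}\delta(ax)\,dx = \int_{-\infty}^{\infty}\delta(y)\,\frac{dy}{a} = \frac{1}{a}; \tag{E.32}$$

for a < 0,

$$\int_{-\infty}^{\infty}\delta(ax)\,dx = \int_{\infty}^{-\infty}\delta(y)\,\frac{dy}{a} = \int_{-\infty}^{\infty}\delta(y)\,\frac{dy}{|a|} = \frac{1}{|a|}. \tag{E.33}$$

(ii) $$\delta(x) = \delta(-x), \ \text{ i.e. an even function.} \tag{E.34}$$

Proof

$$f(0) = \int_{-\infty}^{\infty}\delta(x)\,f(x)\,dx. \tag{E.35}$$

If f(x) is an odd function, f(0) = 0. Thus δ(x) must be an even function.

(iii) $$\delta(f(x)) = \sum_i \frac{1}{|df/dx|_{x=a_i}}\,\delta(x-a_i) \tag{E.36}$$

where $a_i$ are the roots of f(x) = 0.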


Proof 


The δ-function is only non-zero when its argument vanishes. Thus we are
concerned with the roots of f(x) = 0. In the vicinity of a root

$$f(a_i) = 0 \tag{E.37}$$

we can make a Taylor expansion

$$f(x) = f(a_i) + (x-a_i)\left(\frac{df}{dx}\right)_{x=a_i} + \cdots. \tag{E.38}$$

Thus the δ-function has non-zero contributions from each of the roots $a_i$ of
the form

$$\delta(f(x)) = \sum_i \delta\!\left[(x-a_i)\left(\frac{df}{dx}\right)_{x=a_i}\right]. \tag{E.39}$$

Hence (using property (i)) we have

$$\delta(f(x)) = \sum_i \frac{1}{|df/dx|_{x=a_i}}\,\delta(x-a_i). \tag{E.40}$$

Consider the example

$$\delta(x^2 - a^2). \tag{E.41}$$

Thus

$$f(x) = x^2 - a^2 = (x-a)(x+a) \tag{E.42}$$

with two roots x = ±a (a > 0), and df/dx = 2x. Hence

$$\delta(x^2-a^2) = \frac{1}{2a}\left[\delta(x-a) + \delta(x+a)\right]. \tag{E.43}$$
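The result (E.43) can be checked numerically by replacing the δ-function with
a narrow normalized Gaussian. This is a sketch under stated assumptions: the
Gaussian width `eps`, the test function `f` and the integration window are
illustrative choices, not taken from the text.

```python
import math

def nascent_delta(u, eps):
    """Narrow normalized Gaussian standing in for delta(u)."""
    return math.exp(-(u / eps) ** 2) / (eps * math.sqrt(math.pi))

def integrate(g, lo, hi, steps):
    """Midpoint rule on [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(g(lo + (i + 0.5) * h) for i in range(steps))

def f(x):
    # Arbitrary smooth test function (an assumption for the illustration).
    return math.cos(x) + x

a = 1.5
eps = 1e-2

# Left-hand side: integral of f(x) * delta(x^2 - a^2) with the smeared delta.
lhs = integrate(lambda x: f(x) * nascent_delta(x * x - a * a, eps),
                -4.0, 4.0, 200_000)
# Right-hand side of (E.43) integrated against f.
rhs = (f(a) + f(-a)) / (2.0 * a)
print(lhs, rhs)
```

The two numbers agree up to corrections that vanish as `eps` tends to zero,
which is the sense in which (E.43) holds under an integral.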
(iv) $$x\,\delta(x) = 0. \tag{E.44}$$

This is to be understood as always occurring under an integral. It is obvious
from the definition or from property (ii).

(v) $$\int_{-\infty}^{\infty} f(x)\,\delta'(x)\,dx = -f'(0) \tag{E.45}$$

where

$$\delta'(x) = \frac{d}{dx}\,\delta(x). \tag{E.46}$$

Proof

$$\int_{-\infty}^{\infty} f(x)\,\delta'(x)\,dx = -\int_{-\infty}^{\infty} f'(x)\,\delta(x)\,dx + \left[f(x)\,\delta(x)\right]_{-\infty}^{\infty} = -f'(0) \tag{E.47}$$

since the second term vanishes.

(vi) $$\int_{-\infty}^{x}\delta(x'-a)\,dx' = \theta(x-a) \tag{E.48}$$

where

$$\theta(x) = \begin{cases} 0 & \text{for } x < 0 \\ 1 & \text{for } x > 0 \end{cases} \tag{E.49}$$

is the so-called ‘θ-function’.

Proof

For x > a,

$$\int_{-\infty}^{x}\delta(x'-a)\,dx' = 1; \tag{E.50}$$

for x < a,

$$\int_{-\infty}^{x}\delta(x'-a)\,dx' = 0. \tag{E.51}$$

By a simple extension it is easy to prove the result

$$\int_{x_1}^{x_2}\delta(x-a)\,dx = \theta(x_2-a) - \theta(x_1-a). \tag{E.52}$$

(vii) $$\delta(x-y)\,\delta(x-z) = \delta(x-y)\,\delta(y-z). \tag{E.53}$$

Proof

Take any continuous function of z, f(z). Then

$$\int f(z)\,dz\,\{\delta(x-y)\,\delta(x-z)\} = f(x)\,\delta(x-y) \tag{E.54}$$

$$= f(y)\,\delta(x-y) = \int f(z)\,dz\,\{\delta(x-y)\,\delta(y-z)\}. \tag{E.55}$$

Thus the two sides of (vii) are equivalent as factors in an integrand with z as
the integration variable.

Exercise


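Use property (iii) plus the definition of the θ-function to perform the $p^0$
integration and prove the useful phase space formula

$$\int d^4p\,\delta(p^2-m^2)\,\theta(p^0) = \int \frac{d^3\mathbf{p}}{2E} \tag{E.56}$$

where

$$p^2 = (p^0)^2 - \mathbf{p}^2 \tag{E.57}$$

and

$$E = +(\mathbf{p}^2 + m^2)^{1/2}. \tag{E.58}$$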
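As a guide to the exercise, the key step can be sketched as follows (the
reader may prefer to attempt it first). With $\mathbf{p}$ held fixed, the
argument of the δ-function is $(p^0)^2 - E^2$, which has the two roots
$p^0 = \pm E$, so property (iii) in the form (E.43) applies:

$$\delta(p^2-m^2) = \delta\big((p^0)^2 - E^2\big) = \frac{1}{2E}\left[\delta(p^0-E) + \delta(p^0+E)\right],$$

$$\int d^3\mathbf{p}\int_{-\infty}^{\infty} dp^0\,\frac{1}{2E}\left[\delta(p^0-E) + \delta(p^0+E)\right]\theta(p^0) = \int \frac{d^3\mathbf{p}}{2E}.$$

The factor $\theta(p^0)$ removes the contribution of the negative root
$p^0 = -E$, since E > 0.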


The relation (E.56) shows that the expression $d^3\mathbf{p}/2E$ is Lorentz
invariant: on the left-hand side, $d^4p$ and $\delta(p^2-m^2)$ are invariant,
while $\theta(p^0)$ depends only on the sign of $p^0$, which cannot be changed
by a ‘proper’ Lorentz transformation, that is, one that does not reverse the
sense of time.

F


Contour Integration 


We begin by recalling some relevant results from the calculus of real functions
of two real variables x and y, which we shall phrase in ‘physical’ terms.
Consider a particle moving in the xy-plane subject to a force
$\mathbf{F} = (P(x,y), Q(x,y))$ whose x- and y-components P and Q vary
throughout the plane. Suppose the particle moves, under the action of the
force, around a closed path C in the xy-plane. Then the total work done by the
force on the particle, $W_C$, will be given by the integral

$$W_C = \oint_C \mathbf{F}\cdot d\mathbf{r} = \oint_C P\,dx + Q\,dy \tag{F.1}$$

where the ∮ sign means that the integration path is closed. Using Stokes’
theorem, we can rewrite (F.1) as a surface integral

$$W_C = \iint_S \mathrm{curl}\,\mathbf{F}\cdot d\mathbf{S} \tag{F.2}$$

where S is any surface bounded by C (as a butterfly net is bounded by the rim).
Taking S to be the area in the xy-plane enclosed by C, we have
$d\mathbf{S} = dx\,dy\,\hat{\mathbf{k}}$ and

$$W_C = \iint_S \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) dx\,dy. \tag{F.3}$$

A mathematically special, but physically common, case is that in which
$\mathbf{F}$ is a ‘conservative force’, derivable from a potential function
V(x, y) (in this two-dimensional example) such that

$$P(x,y) = -\frac{\partial V}{\partial x} \quad\text{and}\quad Q(x,y) = -\frac{\partial V}{\partial y} \tag{F.4}$$

the minus signs being the usual convention. In that case, it is clear that

$$\frac{\partial P}{\partial y} = \frac{\partial Q}{\partial x} \tag{F.5}$$

and hence $W_C$ in (F.3) is zero. The condition (F.5) is, in fact, both
necessary and sufficient for $W_C = 0$.
There can, however, be surprises. Consider, for example, the potential

$$V(x,y) = -\tan^{-1}(y/x). \tag{F.6}$$

In this case the components of the associated force are

$$-\frac{\partial V}{\partial x} = \frac{-y}{x^2+y^2}, \qquad -\frac{\partial V}{\partial y} = \frac{x}{x^2+y^2}. \tag{F.7}$$

Let us calculate the work done by this force in the case that C is the circle
of unit radius centred on the origin, traversed in the anticlockwise sense. We
may parametrize a point on this circle by (x = cos θ, y = sin θ), so that (F.1)
becomes

$$W_C = \oint_C -\sin\theta\,(-\sin\theta\,d\theta) + \cos\theta\,(\cos\theta\,d\theta) = \oint_C d\theta = 2\pi \tag{F.8}$$

a result which is plainly different from zero. The reason is that although this
force is (minus) the gradient of a potential, the latter is not single-valued, in
the sense that it does not return to its original value after a circuit round the
origin. Indeed, the V of (F.6) is just −θ, which changes by −2π on such a
circuit, exactly as calculated in (F.8) allowing for the minus signs in (F.4).
Alternatively, we may suspect that the trouble has to do with the ‘blow up’
of the integrand of (F.7) at the point x = y = 0, which is also true.
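The role of the origin can be illustrated numerically. The sketch below, which
is not part of the original text, evaluates the line integral (F.1) for the
force (F.7) around two circles: the unit circle centred on the origin, and a
unit circle centred at (3, 0) that does not enclose the origin. The second
circle and the step count are illustrative assumptions.

```python
import math

def force(x, y):
    """Components (P, Q) of the force (F.7) derived from V = -arctan(y/x)."""
    r2 = x * x + y * y
    return (-y / r2, x / r2)

def work_around_circle(cx, cy, radius, steps=20_000):
    """Line integral of P dx + Q dy, anticlockwise, by the midpoint rule."""
    dtheta = 2.0 * math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * dtheta
        x = cx + radius * math.cos(theta)
        y = cy + radius * math.sin(theta)
        dx = -radius * math.sin(theta) * dtheta
        dy = radius * math.cos(theta) * dtheta
        p, q = force(x, y)
        total += p * dx + q * dy
    return total

print(work_around_circle(0.0, 0.0, 1.0))  # encloses the origin: close to 2*pi
print(work_around_circle(3.0, 0.0, 1.0))  # excludes the origin: close to 0
```

On the second circle the potential is single-valued and the work vanishes, in
line with the condition (F.5).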

Much of the foregoing has direct parallels within the theory of functions
of a complex variable z = x + iy, to which we now give a brief and informal
introduction, limiting ourselves to the minimum required in the text¹. The
crucial property, to which all the results we need are related, is analyticity. A
function f(z) is analytic in a region R of the complex plane if it has a unique
derivative at every point of R. The derivative at a point z is defined by the
natural generalization of the real variable definition:

$$\frac{df}{dz} = \lim_{\Delta z\to 0}\frac{f(z+\Delta z) - f(z)}{\Delta z}. \tag{F.9}$$

The crucial new feature in the complex case, however, is that ‘Δz’ is actually
an (infinitesimal) vector, in the xy (Argand) plane. Thus we may immediately
ask: along which of the infinitely many possible directions of Δz are we
supposed to approach the point z in (F.9)? The answer is: along any! This is
the force of the word ‘unique’ in the definition of analyticity, and it is a very
powerful requirement.

¹ For a fuller introduction, see for example Boas (1983, chapter 14).

Let f(z) be an analytic function of z in some region R, and let u and v
be the real and imaginary parts of f: f = u + iv, where u and v are each
functions of x and y. Let us evaluate df/dz at the point z = x + iy in two
different ways, which must be equivalent.

(i) By considering Δz = Δx (i.e. Δy = 0). In this case

$$\frac{df}{dz} = \lim_{\Delta x\to 0}\frac{u(x+\Delta x,y) - u(x,y) + iv(x+\Delta x,y) - iv(x,y)}{\Delta x} = \frac{\partial u}{\partial x} + i\frac{\partial v}{\partial x} \tag{F.10}$$

from the definition of a partial derivative.

(ii) By considering Δz = iΔy (i.e. Δx = 0). In this case

$$\frac{df}{dz} = \lim_{\Delta y\to 0}\frac{u(x,y+\Delta y) - u(x,y) + iv(x,y+\Delta y) - iv(x,y)}{i\,\Delta y} = \frac{\partial v}{\partial y} - i\frac{\partial u}{\partial y}. \tag{F.11}$$

Equating (F.10) and (F.11), we obtain the Cauchy–Riemann (CR) relations

$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \qquad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x} \tag{F.12}$$

which are the necessary and sufficient conditions for f to be analytic.
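The CR relations (F.12) are easy to test numerically with finite differences.
The following sketch is not part of the original text; the sample functions
$z^2$ and $z^*$, the sample point and the step size `h` are illustrative
assumptions.

```python
def partials(f, x, y, h=1e-6):
    """Central-difference estimates of (u_x, u_y, v_x, v_y) for f = u + iv."""
    fx = (f(complex(x + h, y)) - f(complex(x - h, y))) / (2 * h)
    fy = (f(complex(x, y + h)) - f(complex(x, y - h))) / (2 * h)
    return fx.real, fy.real, fx.imag, fy.imag

def cr_residual(f, x, y):
    """Largest violation of u_x = v_y and u_y = -v_x at the point (x, y)."""
    ux, uy, vx, vy = partials(f, x, y)
    return max(abs(ux - vy), abs(uy + vx))

# z^2 is analytic, so both relations hold up to rounding error.
print(cr_residual(lambda z: z * z, 0.7, -0.3))
# The complex conjugate has u = x, v = -y, so u_x = 1 but v_y = -1.
print(cr_residual(lambda z: z.conjugate(), 0.7, -0.3))
```

The second function fails the first CR relation everywhere, which reflects the
fact that its ‘derivative’ would depend on the direction of Δz.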
Consider now an integral of the form

$$I = \oint_C f(z)\,dz \tag{F.13}$$

where again the symbol ∮ means that the integration path (or contour) in the
complex plane is closed. Inserting f = u + iv and z = x + iy, we may write
(F.13) as

$$I = \oint_C (u\,dx - v\,dy) + i\oint_C (v\,dx + u\,dy). \tag{F.14}$$

Thus the single complex integral (F.13) is equivalent to the two real-plane
integrals (F.14); one is the real part of I, the other is the imaginary part,
and each is of the form (F.1). In the first, we have P = u, Q = −v. Hence
the condition (F.5) for the integral to vanish is ∂u/∂y = −∂v/∂x, which is
precisely the second CR relation! Similarly, in the second integral in (F.14)
we have P = v and Q = u so that condition (F.5) becomes ∂v/∂y = ∂u/∂x,
which is the first CR relation. It follows that if f(z) is analytic inside and on
C, then

$$\oint_C f(z)\,dz = 0 \tag{F.15}$$

a result known as Cauchy’s theorem, the foundation of complex integral
calculus.

Now let us consider a simple case in which (as in (F.7)) the result of
integrating a complex function around a closed curve is not zero, namely the
integral

$$\oint_C \frac{dz}{z} \tag{F.16}$$

where C is the circle of radius ρ enclosing the origin. On this circle,
$z = \rho e^{i\theta}$ where ρ is fixed and 0 ≤ θ ≤ 2π, so

$$\oint_C \frac{dz}{z} = \oint_C \frac{\rho e^{i\theta}\,i\,d\theta}{\rho e^{i\theta}} = i\oint_C d\theta = 2\pi i. \tag{F.17}$$

Cauchy’s theorem does not apply in this case because the function being
integrated ($z^{-1}$) is not analytic at z = 0. Writing dz/z in terms of x and y
we have

$$\frac{dz}{z} = \frac{dx + i\,dy}{x + iy} = \frac{(dx + i\,dy)(x - iy)}{x^2+y^2} = \frac{x\,dx + y\,dy}{x^2+y^2} + i\left(\frac{-y\,dx + x\,dy}{x^2+y^2}\right). \tag{F.18}$$

The reader will recognize the imaginary part of (F.18) as involving precisely
the functions (F.7) studied earlier, and may like to find the real potential
function appropriate to the real part of (F.18).

We note that the result (F.17) is independent of the circle’s radius ρ. This
means that we can shrink or expand the circle how we like, without affecting
the answer. The reader may like to show that the circle can, in fact, be
distorted into a simple closed loop of any shape, enclosing z = 0, and the
answer will still be 2πi. In general, a contour may be freely distorted in any
region in which the integrand is analytic.
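The freedom to distort the contour can be illustrated numerically. In this
sketch, which is not part of the original text, the circle is replaced by a
square; the particular squares and the number of steps per edge are
illustrative assumptions.

```python
def contour_integral(f, vertices, steps_per_edge=5_000):
    """Midpoint-rule integral of f(z) dz along a closed polygon."""
    total = 0j
    n = len(vertices)
    for j in range(n):
        start, end = vertices[j], vertices[(j + 1) % n]
        dz = (end - start) / steps_per_edge
        for i in range(steps_per_edge):
            total += f(start + (i + 0.5) * dz) * dz
    return total

# Square traversed anticlockwise, enclosing z = 0.
square = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]
# The same square shifted so that z = 0 lies outside it.
shifted = [z + 3 for z in square]

print(contour_integral(lambda z: 1 / z, square))   # close to 2*pi*i
print(contour_integral(lambda z: 1 / z, shifted))  # close to 0
```

The first value reproduces (F.17) for a non-circular loop, and the second is
an instance of Cauchy’s theorem (F.15), since 1/z is analytic inside and on the
shifted square.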

We are now in a position to prove the main integration formula we need,
which is Cauchy’s integral formula: let f(z) be analytic inside and on a simple
closed curve C which encloses the point z = a; then

$$\oint_C \frac{f(z)}{z-a}\,dz = 2\pi i\,f(a) \tag{F.19}$$

where it is understood that C is traversed in an anticlockwise sense around
z = a. The proof follows. The integrand in (F.19) is analytic inside and on C,
except at z = a; we may therefore distort the contour C by shrinking it into a
very small circle of fixed radius ρ around the point z = a. On this circle, z is
given by $z = a + \rho e^{i\theta}$, and

$$\oint_C \frac{f(z)}{z-a}\,dz = \int_0^{2\pi}\frac{f(a+\rho e^{i\theta})}{\rho e^{i\theta}}\,\rho e^{i\theta}\,i\,d\theta = \int_0^{2\pi} f(a+\rho e^{i\theta})\,i\,d\theta. \tag{F.20}$$

Now, since f is analytic at z = a, it has a unique derivative there, and is
consequently continuous at z = a. We may then take the limit ρ → 0 in
(F.20), obtaining $\lim_{\rho\to 0} f(a + \rho e^{i\theta}) = f(a)$, and hence

$$\oint_C \frac{f(z)}{z-a}\,dz = f(a)\int_0^{2\pi} i\,d\theta = 2\pi i\,f(a) \tag{F.21}$$

as stated.
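Cauchy’s integral formula can also be checked numerically. The sketch below is
not part of the original text; the choice $f(z) = e^z$, the point a and the
radii are illustrative assumptions. It uses the parametrization
$z = a + \rho e^{i\theta}$ of (F.20).

```python
import cmath
import math

def circle_integral(g, centre, radius, steps=4096):
    """Integral of g(z) dz anticlockwise round a circle (midpoint rule in theta)."""
    total = 0j
    dtheta = 2.0 * math.pi / steps
    for i in range(steps):
        theta = (i + 0.5) * dtheta
        w = radius * cmath.exp(1j * theta)
        # dz = i * rho * exp(i theta) * d theta
        total += g(centre + w) * (1j * w) * dtheta
    return total

a = 0.3 + 0.2j
for rho in (2.0, 0.5, 0.01):
    value = circle_integral(lambda z: cmath.exp(z) / (z - a), a, rho)
    # value / (2 pi i) should equal f(a) = exp(a) for every radius.
    print(rho, value / (2j * math.pi), cmath.exp(a))
```

The result does not depend on ρ, as expected from the contour-distortion
argument used in the proof.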

We now use these results to establish the representation of the θ-function
(see (E.49)) quoted in section 6.3.2. Consider the function F(t) of the real
variable t defined by

$$F(t) = \frac{i}{2\pi}\oint_{C=C_1+C_2}\frac{e^{-izt}}{z+i\epsilon}\,dz \tag{F.22}$$

FIGURE F.1
Contours for F(t): (a) t < 0; (b) t > 0. In both panels the pole lies at z = −iε.

where ε is an infinitesimally small positive number (i.e. it will tend to zero
through positive values). The closed contour C is made up of $C_1$, which is the
real axis from −R to R (we shall let R → ∞ at the end), and of $C_2$, which is
a large semicircle of radius R with diameter the real axis, in either the upper
or lower half-plane, the choice being determined by the sign of t, as we shall
now explain (see figure F.1). Suppose first that t < 0, and let z on $C_2$ be
parametrized as $z = Re^{i\theta} = R\cos\theta + iR\sin\theta$. Then

$$e^{-izt} = e^{iz|t|} = e^{-R\sin\theta\,|t|}\,e^{iR\cos\theta\,|t|} \tag{F.23}$$

from which it follows that the contribution to (F.22) from $C_2$ will vanish
exponentially as R → ∞ provided that θ > 0, i.e. we choose $C_2$ to be in
the upper half-plane (figure F.1(a)). In that case the integrand of (F.22) is
analytic inside and on C (the only non-analytic point is outside C at z = −iε)
and so

$$F(t) = 0 \quad\text{for } t < 0. \tag{F.24}$$


However, suppose t > 0. Then

$$e^{-izt} = e^{R\sin\theta\,t}\,e^{-iR\cos\theta\,t} \tag{F.25}$$

and in this case we must choose the ‘contour-closing’ $C_2$ to be in the lower
half-plane (θ < 0) or else (F.25) will diverge exponentially as R → ∞. With
this choice the $C_2$ contribution will again go to zero as R → ∞. However,
this time the whole closed contour C does enclose the point z = −iε (see
figure F.1(b)), and we may apply Cauchy’s integral formula to get, for t > 0,

$$F(t) = -2\pi i\,\frac{i}{2\pi}\,e^{-\epsilon t}, \tag{F.26}$$

the minus sign at the front arising from the fact (see figure F.1(b)) that C is
now being traversed in a clockwise sense around z = −iε (this just inverts the
limits in (F.21)). Thus as ε → 0,

$$F(t) \to 1 \quad\text{for } t > 0. \tag{F.27}$$


Summarizing these manoeuvres, for t < 0 we chose $C_2$ in (F.22) in the upper
half-plane (figure F.1(a)), and its contribution vanished as R → ∞. In this
case we have, as R → ∞,

$$F(t) \to \frac{i}{2\pi}\int_{-\infty}^{\infty}\frac{e^{-izt}}{z+i\epsilon}\,dz = 0 \quad\text{for } t < 0. \tag{F.28}$$

For t > 0 we chose $C_2$ in the lower half-plane (figure F.1(b)), when again its
contribution vanished as R → ∞. However, in this case F does not vanish,
but instead we have, as R → ∞,

$$F(t) \to \frac{i}{2\pi}\int_{-\infty}^{\infty}\frac{e^{-izt}}{z+i\epsilon}\,dz = 1 \quad\text{for } t > 0. \tag{F.29}$$

Equations (F.28) and (F.29) show that we may indeed write

$$\theta(t) = \lim_{\epsilon\to 0}\frac{i}{2\pi}\int_{-\infty}^{\infty}\frac{e^{-izt}}{z+i\epsilon}\,dz \tag{F.30}$$

as claimed in section 6.3, equation (6.93).
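As a consistency check on this representation (a formal manipulation, in
which the order of the limit, the derivative and the integral is freely
interchanged), we may differentiate with respect to t:

$$\frac{d\theta}{dt} = \lim_{\epsilon\to 0}\frac{i}{2\pi}\int_{-\infty}^{\infty}\frac{(-iz)\,e^{-izt}}{z+i\epsilon}\,dz = \lim_{\epsilon\to 0}\frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{z}{z+i\epsilon}\,e^{-izt}\,dz = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-izt}\,dz = \delta(t),$$

where the last step uses (E.25) together with the evenness property (E.34).
This agrees with property (vi) of appendix E, according to which the
θ-function is the integral of the δ-function.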


G 


Green Functions 


Let us start with a simple but important example. We seek the solution $G_0(\mathbf{r})$
of the equation

$$\nabla^2 G_0(\mathbf{r}) = \delta(\mathbf{r}). \tag{G.1}$$

There is a ‘physical’ way to look at this equation which will give us the answer
straightaway. Recall that Gauss’ law in electrostatics (appendix C) is

$$\nabla\cdot\mathbf{E} = \rho/\epsilon_0 \tag{G.2}$$

and that $\mathbf{E}$ is expressed in terms of the electrostatic potential V as
$\mathbf{E} = -\nabla V$. Then (G.2) becomes

$$\nabla^2 V = -\rho/\epsilon_0 \tag{G.3}$$

which is known as Poisson’s equation. Comparing (G.3) and (G.1), we see
that $(-G_0(\mathbf{r})/\epsilon_0)$ can be regarded as the ‘potential’ due to a
source ρ which is concentrated entirely at the origin, and whose total ‘charge’
is unity, since (see appendix E)

$$\int \delta(\mathbf{r})\,d^3\mathbf{r} = 1. \tag{G.4}$$

In other words, $(-G_0/\epsilon_0)$ is effectively the potential due to a unit
point charge at the origin. But we know exactly what this potential is from
Coulomb’s law, namely

$$-\frac{G_0(\mathbf{r})}{\epsilon_0} = \frac{1}{4\pi\epsilon_0 r} \tag{G.5}$$

whence

$$G_0(\mathbf{r}) = -\frac{1}{4\pi r}. \tag{G.6}$$

We may also check this result mathematically as follows. Using (G.6),
equation (G.1) is equivalent to

$$\nabla^2\frac{1}{r} = -4\pi\,\delta(\mathbf{r}). \tag{G.7}$$

Let us consider the integral of both sides of this equation over a spherical
volume of arbitrary radius R surrounding the origin. The integral of the
left-hand side becomes, using Gauss’ divergence theorem,

$$\int_V \nabla^2\left(\frac{1}{r}\right) d^3\mathbf{r} = \int_V \nabla\cdot\left(\nabla\frac{1}{r}\right) d^3\mathbf{r} = \int_{S\ \text{bounding}\ V} \nabla\left(\frac{1}{r}\right)\cdot\hat{\mathbf{n}}\,dS. \tag{G.8}$$

Now $\nabla(1/r) = -\hat{\mathbf{r}}/r^2$, which equals $-\hat{\mathbf{r}}/R^2$
on the surface S, while $\hat{\mathbf{n}} = \hat{\mathbf{r}}$ and
$dS = R^2\,d\Omega$ with dΩ the element of solid angle on the sphere. So

$$\int_V \nabla^2\left(\frac{1}{r}\right) d^3\mathbf{r} = -\int d\Omega = -4\pi \tag{G.9}$$

which using (G.4) is precisely the integral of the right-hand side of (G.7), as
required.
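Both halves of this check can be repeated numerically. The following sketch,
which is not part of the original text, uses finite differences to confirm
that the Laplacian of $G_0$ vanishes away from the origin, and that the outward
flux of $\nabla G_0$ through a sphere is unity whatever its radius. The radii
and step sizes are illustrative assumptions.

```python
import math

def g0(r):
    """Candidate Green function (G.6): G0(r) = -1/(4 pi r)."""
    return -1.0 / (4.0 * math.pi * r)

def radial_laplacian(g, r, h=1e-3):
    """(1/r) d^2(r g)/dr^2, the Laplacian of a spherically symmetric g(r)."""
    return ((r + h) * g(r + h) - 2.0 * r * g(r) + (r - h) * g(r - h)) / (h * h * r)

def outward_flux(g, R, h=1e-5):
    """Surface integral of grad g . n over a sphere of radius R."""
    dg_dr = (g(R + h) - g(R - h)) / (2.0 * h)
    return 4.0 * math.pi * R * R * dg_dr

for R in (0.1, 1.0, 10.0):
    # The Laplacian should be close to 0 and the flux close to 1 for every R.
    print(R, radial_laplacian(g0, R), outward_flux(g0, R))
```

A flux of unity independent of R is what the δ-function source in (G.1)
requires, by the divergence theorem and (G.4).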
Consider now the solutions of

$$(\nabla^2 + k^2)\,G_k(\mathbf{r}) = \delta(\mathbf{r}). \tag{G.10}$$

We are interested in rotationally invariant solutions, for which G is a function
of $r = |\mathbf{r}|$ alone. For r ≠ 0, equation (G.10) is easy to solve. Setting
G(r) = f(r)/r, and using

$$\nabla^2 = \frac{1}{r^2}\frac{\partial}{\partial r}\,r^2\frac{\partial}{\partial r} + \text{parts depending on } \frac{\partial}{\partial\theta} \text{ and } \frac{\partial}{\partial\phi}$$

we find that f(r) satisfies

$$\frac{d^2f}{dr^2} + k^2 f = 0$$

the general solution to which is ($k = |\mathbf{k}|$)

$$f(r) = Ae^{ikr} + Be^{-ikr},$$

leading to

$$G_k(r) = A\,\frac{e^{ikr}}{r} + B\,\frac{e^{-ikr}}{r} \tag{G.11}$$

for r ≠ 0. In the application to scattering problems (appendix H) we shall
want $G_k$ to contain purely outgoing waves, so we will pick the ‘A’-type solution
in (G.11).

Consider therefore the expression

$$(\nabla^2 + k^2)\left(\frac{Ae^{ikr}}{r}\right) \tag{G.12}$$

where r is now allowed to take the value zero. Making use of the vector
operator result

$$\nabla^2(fg) = (\nabla^2 f)\,g + 2\nabla f\cdot\nabla g + f\,\nabla^2 g$$

with ‘f’ = $e^{ikr}$ and ‘g’ = 1/r, together with

$$\nabla e^{ikr} = \frac{ik\,\mathbf{r}\,e^{ikr}}{r}, \qquad \nabla\frac{1}{r} = -\frac{\mathbf{r}}{r^3}, \qquad \nabla^2 e^{ikr} = -k^2 e^{ikr} + \frac{2ik\,e^{ikr}}{r}$$

we find

$$(\nabla^2 + k^2)\left(\frac{Ae^{ikr}}{r}\right) = Ae^{ikr}\,\nabla^2\left(\frac{1}{r}\right) = -4\pi A\,e^{ikr}\,\delta(\mathbf{r}) = -4\pi A\,\delta(\mathbf{r}) \tag{G.13}$$

where we have replaced r by zero in the exponent of the last term of the
last line in (G.13), since the δ-function ensures that only this point need be
considered for this term. By choosing the constant A = −1/4π, we find that
the (outgoing wave) solution of (G.10) is

$$G_k^{(+)}(r) = -\frac{e^{ikr}}{4\pi r}. \tag{G.14}$$


We are also interested in spherically symmetric solutions of (restoring c
and ħ explicitly for the moment)

$$\left(\nabla^2 - \frac{m^2c^2}{\hbar^2}\right)\phi(\mathbf{r}) = \delta(\mathbf{r}) \tag{G.15}$$

which is the equation analogous to (G.1) for a static classical scalar potential
of a field whose quanta have mass m. The solutions to (G.15) are easily found
from the previous work by letting k → imc/ħ. Retaining now the solution
which goes to zero as r → ∞, we find

$$\phi(r) = -\frac{1}{4\pi}\,\frac{e^{-r/a}}{r} \tag{G.16}$$

where a = ħ/mc, the Compton wavelength of the quantum, with mass m. The
potential (G.16) is (up to numerical constants) the famous Yukawa potential,
in which the quantity ‘a’ is called the range: as r gets greater than a, φ(r)
becomes exponentially small. Thus, just as the Coulomb potential is the
solution of Poisson’s equation (G.3) corresponding to a point source at the
origin, so the Yukawa potential is the solution of the analogous equation
(G.15), also with a point source at the origin. Note that as a → ∞,
$\phi(r) \to G_0(r)$.
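A quick numerical check of (G.16) is sketched below; it is not part of the
original text. It verifies by finite differences that
$(\nabla^2 - 1/a^2)\phi = 0$ for r ≠ 0, and that φ approaches $G_0$ when the
range a is very large. The sample radii, ranges and step size are illustrative
assumptions.

```python
import math

def yukawa(r, a):
    """phi(r) = -exp(-r/a) / (4 pi r), equation (G.16)."""
    return -math.exp(-r / a) / (4.0 * math.pi * r)

def residual(r, a, h=1e-3):
    """(laplacian - 1/a^2) phi at r != 0, with laplacian = (1/r) d^2(r phi)/dr^2."""
    f = lambda s: s * yukawa(s, a)
    lap = (f(r + h) - 2.0 * f(r) + f(r - h)) / (h * h * r)
    return lap - yukawa(r, a) / (a * a)

# Away from the origin the residual should vanish up to discretization error.
print(residual(0.5, a=1.0), residual(2.0, a=1.0))
# For very large range a, phi(1) should approach G0(1) = -1/(4 pi).
print(yukawa(1.0, a=1e6), -1.0 / (4.0 * math.pi))
```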

Functions such as $G_k$, $G_0$ and φ, which generically satisfy equations of the
form

$$\Omega_r\,G(\mathbf{r}) = \delta(\mathbf{r}) \tag{G.17}$$

where $\Omega_r$ is some linear differential operator, are said to be Green
functions of the operator $\Omega_r$. From the examples already treated, it is
clear that $G(\mathbf{r})$ in (G.17) has the general interpretation of a
‘potential’ due to a point source at the origin, when $\Omega_r$ is the
appropriate operator for the field theory in question.

Green functions play an important role in the solution of differential
equations of the type

$$\Omega_r\,\psi(\mathbf{r}) = s(\mathbf{r}) \tag{G.18}$$

where $s(\mathbf{r})$ is a known ‘source function’ (e.g. the charge density in
(G.3)). The solution of (G.18) may be written as

$$\psi(\mathbf{r}) = u(\mathbf{r}) + \int G(\mathbf{r}-\mathbf{r}')\,s(\mathbf{r}')\,d^3\mathbf{r}' \tag{G.19}$$

where $u(\mathbf{r})$ is a solution of $\Omega_r u(\mathbf{r}) = 0$. Thus once
we know G, we have the solution via (G.19).

Equation (G.19) has a simple physical interpretation. We know that
$G(\mathbf{r})$ is the solution of (G.18) with $s(\mathbf{r})$ replaced by
$\delta(\mathbf{r})$. But by writing

$$s(\mathbf{r}) = \int \delta(\mathbf{r}-\mathbf{r}')\,s(\mathbf{r}')\,d^3\mathbf{r}' \tag{G.20}$$

we can formally regard $s(\mathbf{r})$ as being made up of a superposition of
point sources, distributed at points $\mathbf{r}'$ with a weighting function
$s(\mathbf{r}')$. Then, since the operator $\Omega_r$ is (by assumption) linear,
the solution for such a superposition of point sources must be just the same
superposition of the point source solutions, namely the integral on the
right-hand side of (G.19). This integral term is, in fact, the ‘particular
integral’ of the differential equation (G.18), while the $u(\mathbf{r})$ is the
‘complementary function’.
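The superposition idea can be made concrete with a small numerical sketch,
which is not part of the original text. A source of unit total strength,
spread uniformly through the unit ball, is replaced by equal-weight point
sources on a cubic grid, and the particular integral in (G.19) is approximated
by a sum over the point-source solution (G.6). The uniform ball, the grid size
and the field point are illustrative assumptions.

```python
import math

def g0(dx, dy, dz):
    """Point-source solution (G.6) evaluated at separation (dx, dy, dz)."""
    return -1.0 / (4.0 * math.pi * math.sqrt(dx * dx + dy * dy + dz * dz))

def ball_sources(n=16):
    """Equal-weight point sources on a cubic grid inside the unit ball.

    The weights are normalized so that the total source strength is 1.
    """
    h = 2.0 / n
    pts = []
    for i in range(n):
        for j in range(n):
            for k in range(n):
                x = -1.0 + (i + 0.5) * h
                y = -1.0 + (j + 0.5) * h
                z = -1.0 + (k + 0.5) * h
                if x * x + y * y + z * z <= 1.0:
                    pts.append((x, y, z))
    w = 1.0 / len(pts)
    return [(p, w) for p in pts]

def psi(r, sources):
    """Discretized particular integral of (G.19): sum of G0(r - r') s(r') d^3r'."""
    return sum(w * g0(r[0] - p[0], r[1] - p[1], r[2] - p[2]) for p, w in sources)

sources = ball_sources()
# Outside the ball the result should be close to that of a unit point source.
print(psi((3.0, 0.0, 0.0), sources), g0(3.0, 0.0, 0.0))
```

Outside the ball the superposed solution is close to that of a single unit
point source at the origin, the familiar electrostatic result for a
spherically symmetric charge distribution.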

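Equation (G.19) can also be checked analytically. First note that it is
generally the case that the operator $\Omega_r$ is translationally invariant,
so that

$$\Omega_r = \Omega_{r-r'}; \tag{G.21}$$

the right-hand side of (G.21) amounts to shifting the origin to the point
$\mathbf{r}'$. Applying $\Omega_r$ to both sides of (G.19), we find

$$\Omega_r\,\psi(\mathbf{r}) = \Omega_r\,u(\mathbf{r}) + \int \Omega_r\,G(\mathbf{r}-\mathbf{r}')\,s(\mathbf{r}')\,d^3\mathbf{r}'$$

$$= 0 + \int \Omega_{r-r'}\,G(\mathbf{r}-\mathbf{r}')\,s(\mathbf{r}')\,d^3\mathbf{r}' = \int \delta(\mathbf{r}-\mathbf{r}')\,s(\mathbf{r}')\,d^3\mathbf{r}'$$

$$= s(\mathbf{r})$$

as required in (G.18).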
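Finally, consider the Fourier transform of equation (G.10), defined as

$$\int e^{-i\mathbf{q}\cdot\mathbf{r}}\,(\nabla^2 + k^2)\,G_k(\mathbf{r})\,d^3\mathbf{r} = \int e^{-i\mathbf{q}\cdot\mathbf{r}}\,\delta(\mathbf{r})\,d^3\mathbf{r}.$$

The right-hand side is unity, by equation (G.4). On the left-hand side we may
use the result

$$\int u\,(\nabla^2 v)\,d^3\mathbf{r} = \int (\nabla^2 u)\,v\,d^3\mathbf{r}$$

(proved by integrating by parts, assuming u and v go to zero sufficiently fast
at the boundaries of the integral) to obtain

$$\int e^{-i\mathbf{q}\cdot\mathbf{r}}\,(\nabla^2 + k^2)\,G_k(\mathbf{r})\,d^3\mathbf{r} = \int \left[\nabla^2\big(e^{-i\mathbf{q}\cdot\mathbf{r}}\big) + k^2 e^{-i\mathbf{q}\cdot\mathbf{r}}\right] G_k(\mathbf{r})\,d^3\mathbf{r}$$

$$= \int (-\mathbf{q}^2 + k^2)\,e^{-i\mathbf{q}\cdot\mathbf{r}}\,G_k(\mathbf{r})\,d^3\mathbf{r} = (-\mathbf{q}^2 + k^2)\,\tilde{G}_k(\mathbf{q})$$

where $\tilde{G}_k(\mathbf{q})$ is the Fourier transform of $G_k(\mathbf{r})$.
Since this expression has to equal unity, we have

$$\tilde{G}_k(\mathbf{q}) = \frac{1}{k^2 - \mathbf{q}^2}. \tag{G.22}$$

There is, however, a problem with (G.22) as it stands, which is that it is
undefined when the variable $\mathbf{q}^2$ takes the value equal to the
parameter $k^2$ in the original equation. Indeed, various definitions are
possible, corresponding to the type of solution in r-space for
$G_k(\mathbf{r})$ (i.e. ingoing, outgoing or standing wave). It turns out (see
the exercise at the end of this appendix) that the specification which is
equivalent to the solution $G_k^{(+)}(\mathbf{r})$ in (G.14) is to add an
infinitesimally small imaginary part in the denominator of (G.22):

$$\tilde{G}_k^{(+)}(\mathbf{q}) = \frac{1}{k^2 - \mathbf{q}^2 + i\epsilon}. \tag{G.23}$$

In exactly the same way, the Fourier transform of $\phi(\mathbf{r})$ satisfying
(G.15) is

$$\tilde{\phi}(\mathbf{q}) = \frac{-1}{\mathbf{q}^2 + m^2} \tag{G.24}$$

where we have reverted to units such that ħ = c = 1.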
The relativistic generalization of this result is straightforward. Consider
the equation

$$(\Box + m^2)\,G(x) = -\delta(x) \tag{G.25}$$

where x is the coordinate 4-vector and δ(x) is the four-dimensional δ-function,
$\delta(x^0)\,\delta(\mathbf{x})$; the sign in (G.25) has been chosen to be
consistent with (G.15) in the static case. Taking the four-dimensional Fourier
transform, and making suitable assumptions about the vanishing of G at the
boundary of space-time, we obtain

$$(-q^2 + m^2)\,\tilde{G}(q) = -1 \tag{G.26}$$

where

$$\tilde{G}(q) = \int e^{iq\cdot x}\,G(x)\,d^4x$$

and so

$$\tilde{G}(q) = \frac{1}{q^2 - m^2}. \tag{G.27}$$

As we have seen in detail in chapter 6, the Feynman prescription for selecting
the physically desired solution amounts to adding an ‘iε’ term in the
denominator of (G.27):

$$\tilde{G}^{(+)}(q) = \frac{1}{q^2 - m^2 + i\epsilon}. \tag{G.28}$$

Exercise

Verify the ‘iε’ specification in (G.23), using the methods of appendix F. [Hint:
You need to show that the Fourier transform of (G.23), defined by

$$G_k^{(+)}(\mathbf{r}) = \frac{1}{(2\pi)^3}\int e^{i\mathbf{q}\cdot\mathbf{r}}\,\tilde{G}_k^{(+)}(\mathbf{q})\,d^3\mathbf{q}, \tag{G.29}$$

is equal to $G_k^{(+)}(\mathbf{r})$ of (G.14). Do the integration over the polar
angles of $\mathbf{q}$, taking the direction of $\mathbf{r}$ as the polar axis.
This gives

$$G_k^{(+)}(\mathbf{r}) = \frac{-1}{8\pi^2}\int_{-\infty}^{\infty}\left(\frac{e^{iqr} - e^{-iqr}}{ir}\right)\frac{q\,dq}{q^2 - k^2 - i\epsilon} \tag{G.30}$$

where $q = |\mathbf{q}|$, $r = |\mathbf{r}|$, and we have used the fact that the
integrand is an even function of q to extend the lower limit to −∞, with an
overall factor of 1/2. Now convert q to the complex variable z. Locate the
poles of $(z^2 - k^2 - i\epsilon)^{-1}$ (compare the similar calculation in
section 10.3.1, and in appendix F). Apply Cauchy’s integral formula (F.19),
closing the $e^{izr}$ part in the upper half z-plane, and the $e^{-izr}$ part in
the lower half z-plane.]

H

Elements of Non-relativistic Scattering Theory


H.1 Time-independent formulation and differential cross section

We consider the scattering of a particle of mass m by a fixed spherically
symmetric potential $V(\mathbf{r})$; we shall retain ħ explicitly in what
follows. The potential is assumed to go to zero rapidly as r → ∞, as for the
Yukawa potential (G.16); it will turn out that the important Coulomb case can be
treated as the a → ∞ limit of (G.16). We shall treat the problem here as a
stationary state one, in which the Schrödinger wavefunction
$\psi(\mathbf{r},t)$ has the form

$$\psi(\mathbf{r},t) = \phi(\mathbf{r})\,e^{-iEt/\hbar} \tag{H.1}$$

where E is the particle’s energy, and where $\phi(\mathbf{r})$ satisfies the
equation

$$\left[-\frac{\hbar^2}{2m}\nabla^2 + V(\mathbf{r})\right]\phi(\mathbf{r}) = E\,\phi(\mathbf{r}). \tag{H.2}$$

We shall take V to be spherically symmetric, so that $V(\mathbf{r}) = V(r)$
where $r = |\mathbf{r}|$. In this approach to scattering, we suppose the
potential to be ‘bathed’ in a steady flux of incident particles, all of energy
E. The wavefunction for the incident beam, far from the region near the origin
where V is appreciably non-zero, is then just a plane wave of the form
$\phi_{\rm inc} = e^{ikz}$, where the z-axis has been chosen along the
propagation direction, and where $E = \hbar^2k^2/2m$ with
$\mathbf{k} = (0,0,k)$. This plane wave is normalized to one particle per unit
volume, and yields a steady-state flux of

$$\mathbf{j}_{\rm inc} = \frac{\hbar}{2mi}\left[\phi^*_{\rm inc}\nabla\phi_{\rm inc} - \phi_{\rm inc}\nabla\phi^*_{\rm inc}\right] = \hbar\mathbf{k}/m = \mathbf{p}/m \tag{H.3}$$

where the momentum is $\mathbf{p} = \hbar\mathbf{k}$. As expected, the incident
flux is given by the velocity $\mathbf{v}$ per unit volume.

Though we have represented the incident beam as a plane wave, it will,
in practice, be collimated. We could, of course, superpose such plane waves,
with different $\mathbf{k}$’s, to make a wave-packet of any desired
localization. But the dimensions of practical beams are so much greater than
the de Broglie wavelength λ = h/p of our particles, that our plane wave will be
a very good approximation to a realistic packet.

The form of the complete solution to (H.2), even in the region where V is
essentially zero, is not simply the incident plane wave, however. The presence
of the potential gives rise also to a scattered wave, whose form as r → ∞ is

$$\phi_{\rm sc} = f(\theta,\phi)\,\frac{e^{ikr}}{r}. \tag{H.4}$$

We shall actually derive this later, but its physical interpretation is simply
that it is an outgoing ($\sim e^{ikr}$ rather than $e^{-ikr}$) ‘spherical wave’,
with a factor f(θ, φ) called the scattering amplitude that allows for the fact
that even though $V(\mathbf{r})$ is spherically symmetric, the solution, in
general, will not be (recall the bound-state solutions of the Coulomb potential
in the hydrogen atom). Calculating the radial component of the flux
corresponding to (H.4) yields

$$j_{r,{\rm sc}} = \frac{\hbar}{2mi}\left[\phi^*_{\rm sc}\frac{\partial}{\partial r}\phi_{\rm sc} - \phi_{\rm sc}\frac{\partial}{\partial r}\phi^*_{\rm sc}\right] = \frac{\hbar k}{m}\,\frac{|f(\theta,\phi)|^2}{r^2}. \tag{H.5}$$

The flux in the two non-radial directions will contain an extra power of r in
the denominator (recall that

$$\nabla = \hat{\mathbf{r}}\frac{\partial}{\partial r} + \hat{\boldsymbol{\theta}}\,\frac{1}{r}\frac{\partial}{\partial\theta} + \hat{\boldsymbol{\phi}}\,\frac{1}{r\sin\theta}\frac{\partial}{\partial\phi}\ )$$

and so (H.5) represents the correct asymptotic form of the scattered flux.
The cross section is now easily found. The differential cross section, dσ,
for scattering into the element of solid angle dΩ is defined by

$$d\sigma = j_{r,{\rm sc}}\,dS/|\mathbf{j}_{\rm inc}| \tag{H.6}$$

where $dS = r^2\,d\Omega$, so that from (H.3) and (H.5)

$$\frac{d\sigma}{d\Omega} = |f(\theta,\phi)|^2. \tag{H.7}$$

The total cross section is then just

$$\sigma = \int |f(\theta,\phi)|^2\,d\Omega. \tag{H.8}$$

It is important to realize that the complete asymptotic form of the solution
to (H.2) is the superposition of $\phi_{\rm inc}$ and $\phi_{\rm sc}$:

$$\phi(\mathbf{r}) \xrightarrow{\ r\to\infty\ } e^{ikz} + f(\theta,\phi)\,\frac{e^{ikr}}{r}. \tag{H.9}$$

Note that in the ‘forward direction’ (i.e. within a region close to the z-axis, as
determined by the collimation), the incident and scattered waves will
interfere. Careful analysis reveals a depletion of the incident beam in the
forward direction (the ‘shadow’ of the scattering centre), which corresponds
exactly to the total flux scattered into all angles (Gottfried 1966, section
12.3). This is expressed in the optical theorem:

$$\mathrm{Im}\,f(0) = \frac{k}{4\pi}\,\sigma. \tag{H.10}$$


H.2 Expression for the scattering amplitude: Born approximation

We begin by rewriting (H.2) as

$$(\nabla^2 + k^2)\,\phi(\mathbf{r}) = \frac{2m}{\hbar^2}\,V(\mathbf{r})\,\phi(\mathbf{r}). \tag{H.11}$$

This equation is of exactly the form discussed in appendix G, e.g. equation
(G.18) with $\Omega_r = \nabla^2 + k^2$. Further, we know that the Green
function for this $\Omega_r$, corresponding to the desired outgoing wave
solution, is given by (G.14). Using then (G.19) and (G.14), we can immediately
write the ‘formal solution’ of (H.11) as

$$\phi(\mathbf{r}) = e^{ikz} - \frac{1}{4\pi}\int \frac{e^{ik|\mathbf{r}-\mathbf{r}'|}}{|\mathbf{r}-\mathbf{r}'|}\,\frac{2m}{\hbar^2}\,V(\mathbf{r}')\,\phi(\mathbf{r}')\,d^3\mathbf{r}' \tag{H.12}$$

where we have chosen ‘$u(\mathbf{r})$’ in (G.19) to be the incident plane wave
$\phi_{\rm inc}$, and have used $\mathbf{k}\cdot\mathbf{r} = kz$. We say
‘formal’ because of course the unknown $\phi(\mathbf{r}')$ still appears on the
right-hand side of (H.12).

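It may therefore seem that we have made no progress, but in fact (H.12)
leads to a very useful expression for f(θ, φ), which is the quantity we need to
calculate. This can be found by considering the asymptotic (r → ∞) limit of
the integral term in (H.12). We have

$$|\mathbf{r}-\mathbf{r}'| = (r^2 + r'^2 - 2\,\mathbf{r}\cdot\mathbf{r}')^{1/2} \approx r - \mathbf{r}\cdot\mathbf{r}'/r + O\!\left(\frac{1}{r}\right) \text{ terms}. \tag{H.13}$$

Thus in the exponent we may write

$$e^{ik|\mathbf{r}-\mathbf{r}'|} \approx e^{ik(r - \mathbf{r}\cdot\mathbf{r}'/r)} = e^{ikr}\,e^{-i\mathbf{k}'\cdot\mathbf{r}'}$$

where $\mathbf{k}' = k\hat{\mathbf{r}}$ is the outgoing wavevector, pointing
along the direction of the outgoing scattered wave which enters dS. In the
denominator factor we may simply say $|\mathbf{r}-\mathbf{r}'|^{-1} \approx r^{-1}$
since the next term in (H.13) will produce a correction of order $r^{-2}$.
Putting this together, we have

$$\phi(\mathbf{r}) \xrightarrow{\ r\to\infty\ } e^{ikz} - \frac{m}{2\pi\hbar^2}\,\frac{e^{ikr}}{r}\int e^{-i\mathbf{k}'\cdot\mathbf{r}'}\,V(\mathbf{r}')\,\phi(\mathbf{r}')\,d^3\mathbf{r}' \tag{H.14}$$

from which follows the formula for f(θ, φ):

$$f(\theta,\phi) = -\frac{m}{2\pi\hbar^2}\int e^{-i\mathbf{k}'\cdot\mathbf{r}'}\,V(\mathbf{r}')\,\phi(\mathbf{r}')\,d^3\mathbf{r}'. \tag{H.15}$$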

No approximations have been made thus far, in deriving (H.15), but of
course it still involves the unknown $\phi(\mathbf{r}')$ inside the integral.
However, it is in a form which is very convenient for setting up a systematic
approximation scheme, a kind of perturbation theory, in powers of V. If the
potential is relatively ‘weak’, its effect will be such as to produce only a
slight distortion of the incident wave, and so
$\phi(\mathbf{r}) \approx e^{i\mathbf{k}\cdot\mathbf{r}}$ + ‘small correction’.
This suggests that it may be a good approximation to replace
$\phi(\mathbf{r}')$ in (H.15) by the undistorted incident wave
$e^{i\mathbf{k}\cdot\mathbf{r}'}$, giving the approximate scattering amplitude

$$f_{\rm BA}(\theta,\phi) = -\frac{m}{2\pi\hbar^2}\int e^{i\mathbf{q}\cdot\mathbf{r}'}\,V(\mathbf{r}')\,d^3\mathbf{r}' \tag{H.16}$$

where the wave vector transfer $\mathbf{q}$ is given by

$$\mathbf{q} = \mathbf{k} - \mathbf{k}'. \tag{H.17}$$

This is called the ‘Born approximation to the scattering amplitude’. The
criteria for the validity of the Born approximation are discussed in many
standard quantum mechanics texts.
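As an illustration of (H.16), the sketch below (not part of the original text)
evaluates the Born amplitude numerically for a potential of Yukawa form,
$V(r) = g\,e^{-r/a}/r$, and compares it with the closed form
$-2mg/[\hbar^2(q^2 + a^{-2})]$ that follows from the elementary integral
$\int_0^\infty e^{-r/a}\sin(qr)\,dr = q/(q^2 + a^{-2})$. The strength g, the
range a, the units ħ = m = 1 and the numerical cut-off `r_max` are all
assumptions made for the illustration.

```python
import math

HBAR = 1.0   # units with hbar = 1 (an assumption for the illustration)
MASS = 1.0   # particle mass in arbitrary units (also an assumption)

def yukawa(r, g=1.0, a=1.0):
    """Illustrative potential V(r) = g exp(-r/a) / r; g and a are hypothetical."""
    return g * math.exp(-r / a) / r

def born_amplitude(q, potential, r_max=60.0, steps=60_000):
    """f_BA of (H.16) for a central potential.

    The angular integration of exp(i q.r) gives
    (4 pi / q) * integral_0^inf r V(r) sin(q r) dr,
    evaluated here with the midpoint rule on (0, r_max).
    """
    h = r_max / steps
    radial = 0.0
    for i in range(steps):
        r = (i + 0.5) * h
        radial += r * potential(r) * math.sin(q * r)
    v_tilde = 4.0 * math.pi * radial * h / q
    return -MASS / (2.0 * math.pi * HBAR ** 2) * v_tilde

def born_yukawa_exact(q, g=1.0, a=1.0):
    """Closed form of the same integral: -2 m g / (hbar^2 (q^2 + 1/a^2))."""
    return -2.0 * MASS * g / (HBAR ** 2 * (q * q + 1.0 / (a * a)))

for q in (0.5, 1.0, 3.0):
    print(q, born_amplitude(q, yukawa), born_yukawa_exact(q))
```

In the limit a → ∞ the closed form becomes proportional to $1/q^2$, which is
the sense in which the Coulomb case is reached as the long-range limit of the
Yukawa potential.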

The approximation can be improved by returning to (H.12) for
$\phi(\mathbf{r})$, and replacing $\phi(\mathbf{r}')$ inside the integral by
$e^{i\mathbf{k}\cdot\mathbf{r}'}$ just as we did in (H.16); this will give us a
formula for the first-order (in V) correction to $\phi(\mathbf{r})$. We can now
insert this expression for $\phi(\mathbf{r}')$ (i.e.
$\phi(\mathbf{r}') = e^{i\mathbf{k}\cdot\mathbf{r}'}$ + O(V) correction) into
(H.15), which will give us $f_{\rm BA}$ again as the first term, but also
another term, of order $V^2$ (since V appears in the integral in (H.15)). By
iterating the process indefinitely, the Born series can be set up, to all
orders in V.



H.3 Time-dependent approach

In this approach we consider the potential $V(\mathbf{r})$ as causing
transitions between states describing the incident and scattered particles.
From standard time-dependent perturbation theory in quantum mechanics, the
transition probability per unit time for going from state |i⟩ to state |f⟩, to
first order in V, is given by

$$\dot{P}_{fi} = \frac{2\pi}{\hbar}\,|\langle f|V|i\rangle|^2\,\rho(E_f)\big|_{E_f = E_i} \tag{H.18}$$

where $\rho(E_f)\,dE_f$ is the number of final states in the energy range
$dE_f$ around the energy-conserving point $E_i = E_f$. Equation (H.18) is often
known as the ‘Golden Rule’. In the present case, if we adopt the same
normalization as in the previous section, the initial and final states are
represented by the wavefunctions $e^{i\mathbf{k}\cdot\mathbf{r}}$ and
$e^{i\mathbf{k}'\cdot\mathbf{r}}$, so that

$$\langle f|V|i\rangle = \int e^{-i\mathbf{k}'\cdot\mathbf{r}}\,V(\mathbf{r})\,e^{i\mathbf{k}\cdot\mathbf{r}}\,d^3\mathbf{r} = \int e^{i\mathbf{q}\cdot\mathbf{r}}\,V(\mathbf{r})\,d^3\mathbf{r} \equiv \tilde{V}(\mathbf{q}). \tag{H.19}$$

Also, the number of such states in a volume element $d^3\mathbf{p}'$ of
momentum space ($\mathbf{p}' = \hbar\mathbf{k}'$) is
$d^3\mathbf{p}'/(2\pi\hbar)^3$.

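In spherical polar coordinates, with dΩ standing for the element of solid
angle around the direction (θ, φ) of $\mathbf{p}'$, we have

$$d^3\mathbf{p}' = \mathbf{p}'^2\,d|\mathbf{p}'|\,d\Omega = m\,|\mathbf{p}'|\,dE'\,d\Omega \tag{H.20}$$

where we have used $E' = \mathbf{p}'^2/2m$. It follows that

$$\rho(E')\,dE' = \frac{d^3\mathbf{p}'}{(2\pi\hbar)^3} = \frac{m}{(2\pi\hbar)^3}\,|\mathbf{p}'|\,d\Omega\,dE' \tag{H.21}$$

and so

$$\rho(E') = \frac{m}{(2\pi\hbar)^3}\,|\mathbf{p}'|\,d\Omega. \tag{H.22}$$

Inserting (H.19) and (H.22) into (H.18) we obtain, for this case,

$$\dot{P}_{fi} = \frac{2\pi}{\hbar}\,|\tilde{V}(\mathbf{q})|^2\,\frac{m}{(2\pi\hbar)^3}\,|\mathbf{p}'|\,d\Omega. \tag{H.23}$$

To get the cross section, we need to divide this expression by the incident flux,
which is $|\mathbf{p}|/m$ as in (H.3). Thus the differential cross section for
scattering into the element of solid angle dΩ in the direction (θ, φ) is

$$d\sigma = \left(\frac{m}{2\pi\hbar^2}\right)^2 |\tilde{V}(\mathbf{q})|^2\,d\Omega. \tag{H.24}$$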
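The arithmetic behind (H.24) can be made explicit. Dividing (H.23) by the
incident flux $|\mathbf{p}|/m$, and using $|\mathbf{p}'| = |\mathbf{p}|$ (which
follows from the energy-conserving condition $E_f = E_i$ for elastic
scattering), we have

$$d\sigma = \frac{2\pi}{\hbar}\,|\tilde{V}(\mathbf{q})|^2\,\frac{m}{(2\pi\hbar)^3}\,|\mathbf{p}'|\,d\Omega\cdot\frac{m}{|\mathbf{p}|} = \frac{m^2}{4\pi^2\hbar^4}\,|\tilde{V}(\mathbf{q})|^2\,d\Omega = \left(\frac{m}{2\pi\hbar^2}\right)^2 |\tilde{V}(\mathbf{q})|^2\,d\Omega.$$

Since (H.16) reads $f_{\rm BA} = -(m/2\pi\hbar^2)\,\tilde{V}(\mathbf{q})$, this
is just $|f_{\rm BA}|^2\,d\Omega$.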
Comparing (H.24) with (H.7) and (H.16), we see that this application of the
Golden Rule (first-order time-dependent perturbation theory) is exactly
equivalent to the Born approximation in the time-independent approach. It is,
however, the time-dependent approach which is much closer to the corresponding
quantum field theory formulation we introduce in chapter 6.

I

The Schrödinger and Heisenberg Pictures

The standard introductory formalism of quantum mechanics is that of
Schrödinger, in which the dynamical variables (such as $\hat{\mathbf{x}}$ and
$\hat{\mathbf{p}} = -i\nabla$) are independent of time, while the wavefunction
ψ changes with time according to the general equation

$$\hat{H}\,\psi(\mathbf{x},t) = i\,\frac{\partial\psi(\mathbf{x},t)}{\partial t} \tag{I.1}$$

where $\hat{H}$ is the Hamiltonian. Matrix elements of operators $\hat{A}$
depending on $\hat{\mathbf{x}}, \hat{\mathbf{p}}, \ldots$ then have the form

$$\langle\phi|\hat{A}|\psi\rangle = \int \phi^*(\mathbf{x},t)\,\hat{A}\,\psi(\mathbf{x},t)\,d^3\mathbf{x} \tag{I.2}$$

and will, in general, depend on time via the time dependences of φ and ψ.
Although used almost universally in introductory courses on quantum
mechanics, this formulation is not the only possible one, nor is it always the
most convenient.

We may, for example, wish to bring out similarities (and differences) be- 
tween the general dynamical frameworks of quantum and classical mechanics. 
The formulation here does not seem to be well adapted to this purpose, since 
in the classical case the dynamical variables depend on time ($x(t)$, $p(t)$, ...)
and obey equations of motion, while the quantum variables $\hat{A}$ are
time-independent and the ‘equation of motion’ (I.1) is for the wavefunction $\psi$,
which has no classical counterpart. In quantum mechanics, however, it is 
always possible to make unitary transformations of the state vector or wave- 
functions. We can make use of this possibility to obtain an alternative for- 
mulation of quantum mechanics, which is in some ways closer to the spirit of 
classical mechanics, as follows. 

Equation (I.1) can be formally solved to give

$$\psi(\boldsymbol{x},t) = e^{-i\hat{H}t}\,\psi(\boldsymbol{x},0) \tag{I.3}$$

where the exponential (of an operator!) can be defined by the corresponding
power series, for example:

$$e^{-i\hat{H}t} = 1 - i\hat{H}t + \frac{1}{2!}(-i\hat{H}t)^2 + \cdots. \tag{I.4}$$

It is simple to check that (I.3), as defined by (I.4), does satisfy (I.1) and that
the operator $\hat{U} = \exp(-i\hat{H}t)$ is unitary:

$$\hat{U}^\dagger = [\exp(-i\hat{H}t)]^\dagger = \exp(i\hat{H}^\dagger t) = \exp(i\hat{H}t) = \hat{U}^{-1} \tag{I.5}$$


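where the Hermitian property $\hat{H}^\dagger = \hat{H}$ has been used. Thus (I.3) can be
viewed as a unitary transformation from the time-dependent wavefunction
$\psi(\boldsymbol{x},t)$ to the time-independent one $\psi(\boldsymbol{x},0)$. Correspondingly the matrix
element (I.2) is then

$$\langle\phi|\hat{A}|\psi\rangle = \int \phi^*(\boldsymbol{x},0)\, e^{i\hat{H}t}\hat{A}\,e^{-i\hat{H}t}\, \psi(\boldsymbol{x},0)\, d^3\boldsymbol{x} \tag{I.6}$$

which can be regarded as the matrix element of the time-dependent operator

$$\hat{A}(t) = e^{i\hat{H}t}\hat{A}\,e^{-i\hat{H}t} \tag{I.7}$$

between time-independent wavefunctions $\phi^*(\boldsymbol{x},0)$, $\psi(\boldsymbol{x},0)$.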

Since (I.6) is perfectly general, it is clear that we can calculate amplitudes
in quantum mechanics in either of the two ways outlined: (i) by using
time-dependent $\psi$'s and time-independent $\hat{A}$'s, which is called the ‘Schrödinger
picture’; or (ii) by using time-independent $\psi$'s and time-dependent $\hat{A}$'s, which
is called the ‘Heisenberg picture’. The wavefunctions and operators in the two
pictures are related by (I.3) and (I.7). We note that the pictures coincide at
the (conventionally chosen) time $t = 0$.

Since $\hat{A}(t)$ is now time-dependent, we can ask for its equation of motion.
Differentiating (I.7) carefully, we find (if $\hat{A}$ does not depend explicitly on $t$)
that

$$\frac{d\hat{A}(t)}{dt} = -i[\hat{A}(t), \hat{H}] \tag{I.8}$$

which is called the Heisenberg equation of motion for $\hat{A}(t)$. On the right-hand
side of (I.8), $\hat{H}$ is the Schrödinger operator; however, if $\hat{H}$ is substituted for
$\hat{A}$ in (I.7), one finds $\hat{H}(t) = \hat{H}$, so $\hat{H}$ can equally well be interpreted as the
Heisenberg operator. For simple Hamiltonians $\hat{H}$, (I.8) leads to operator equations
quite analogous to classical equations of motion, which can sometimes
be solved explicitly (see section 5.2.2 of chapter 5).

The foregoing ideas apply equally well to the operators and state vectors 
of quantum field theory. 
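
The relation between the two pictures can be illustrated numerically. The sketch below assumes a hypothetical two-level system with a diagonal Hamiltonian (so that $\exp(-i\hat{H}t)$ is just a diagonal matrix of phases) and natural units; all function names are illustrative. It computes a matrix element with time-dependent states and a fixed operator, and again with fixed states and the time-dependent operator of (I.7), and also provides the right-hand side of (I.8) for comparison with a finite-difference derivative.

```python
import cmath


def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def dagger(a):
    """Hermitian conjugate of a square matrix."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]


def apply(m, v):
    """Matrix acting on a column vector."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]


def inner(phi, psi):
    """Inner product <phi|psi>."""
    return sum(x.conjugate() * y for x, y in zip(phi, psi))


def evolution(energies, t):
    """U(t) = exp(-iHt) for a diagonal Hamiltonian with the given eigenvalues."""
    n = len(energies)
    return [[cmath.exp(-1j * energies[i] * t) if i == j else 0j
             for j in range(n)] for i in range(n)]


def heisenberg_operator(a, energies, t):
    """A(t) = exp(iHt) A exp(-iHt), as in (I.7)."""
    u = evolution(energies, t)
    return matmul(dagger(u), matmul(a, u))


def element_schrodinger(phi0, a, psi0, energies, t):
    """Matrix element with time-dependent states and a fixed operator."""
    u = evolution(energies, t)
    return inner(apply(u, phi0), apply(a, apply(u, psi0)))


def element_heisenberg(phi0, a, psi0, energies, t):
    """Matrix element with fixed states and the time-dependent operator."""
    return inner(phi0, apply(heisenberg_operator(a, energies, t), psi0))


def heisenberg_rhs(a_t, energies):
    """Right-hand side of (I.8): -i [A(t), H]."""
    n = len(energies)
    h = [[energies[i] if i == j else 0 for j in range(n)] for i in range(n)]
    ah, ha = matmul(a_t, h), matmul(h, a_t)
    return [[-1j * (ah[i][j] - ha[i][j]) for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    energies = [0.3, 1.1]          # hypothetical eigenvalues of H
    sigma_x = [[0, 1], [1, 0]]     # operator A in the Schrodinger picture
    phi0, psi0 = [0.6, 0.8j], [1.0, 0.0]
    print(element_schrodinger(phi0, sigma_x, psi0, energies, 2.0))
    print(element_heisenberg(phi0, sigma_x, psi0, energies, 2.0))
```

The diagonal Hamiltonian is chosen only to avoid a general matrix exponential; the same comparison holds for any Hermitian $\hat{H}$.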


J 


Dirac Algebra and Trace Identities 


J.1 Dirac algebra 
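J.1.1 γ matrices


The fundamental anticommutator

$$\{\gamma^\mu, \gamma^\nu\} = 2g^{\mu\nu} \tag{J.1}$$

may be used to prove the following results.

$$\gamma_\mu\gamma^\mu = 4 \tag{J.2}$$
$$\gamma_\mu \slashed{a} \gamma^\mu = -2\slashed{a} \tag{J.3}$$
$$\gamma_\mu \slashed{a}\slashed{b} \gamma^\mu = 4\,a\cdot b \tag{J.4}$$
$$\gamma_\mu \slashed{a}\slashed{b}\slashed{c} \gamma^\mu = -2\slashed{c}\slashed{b}\slashed{a} \tag{J.5}$$
$$\slashed{a}\slashed{b} = -\slashed{b}\slashed{a} + 2\,a\cdot b. \tag{J.6}$$

As an example, we prove this last result:

$$\slashed{a}\slashed{b} = a_\mu b_\nu \gamma^\mu\gamma^\nu = a_\mu b_\nu(-\gamma^\nu\gamma^\mu + 2g^{\mu\nu}) = -\slashed{b}\slashed{a} + 2\,a\cdot b. \tag{J.7}$$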


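J.1.2 γ5 identities


Define

$$\gamma_5 = i\gamma^0\gamma^1\gamma^2\gamma^3. \tag{J.8}$$

In the usual representation with

$$\gamma^0 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \quad\text{and}\quad \boldsymbol{\gamma} = \begin{pmatrix} 0 & \boldsymbol{\sigma} \\ -\boldsymbol{\sigma} & 0 \end{pmatrix},$$

$\gamma_5$ is the matrix

$$\gamma_5 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}. \tag{J.9}$$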
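Either from the definition or using this explicit form it is easy to prove that

$$\gamma_5^2 = 1 \tag{J.10}$$

and

$$\{\gamma_5, \gamma^\mu\} = 0 \tag{J.11}$$

i.e. $\gamma_5$ anticommutes with the other γ-matrices. Defining the totally
antisymmetric tensor

$$\epsilon_{\mu\nu\rho\sigma} = \begin{cases} +1 & \text{for an even permutation of 0, 1, 2, 3} \\ -1 & \text{for an odd permutation of 0, 1, 2, 3} \\ \phantom{+}0 & \text{if two or more indices are the same} \end{cases} \tag{J.12}$$

we may write

$$\gamma_5 = \frac{i}{4!}\,\epsilon_{\mu\nu\rho\sigma}\gamma^\mu\gamma^\nu\gamma^\rho\gamma^\sigma. \tag{J.13}$$

With this form it is possible to prove

$$\gamma_5\gamma_\sigma = \frac{i}{3!}\,\epsilon_{\mu\nu\rho\sigma}\gamma^\mu\gamma^\nu\gamma^\rho \tag{J.14}$$

and the identity

$$\gamma^\mu\gamma^\nu\gamma^\rho = g^{\mu\nu}\gamma^\rho - g^{\mu\rho}\gamma^\nu + g^{\nu\rho}\gamma^\mu + i\epsilon^{\mu\nu\rho\sigma}\gamma_5\gamma_\sigma. \tag{J.15}$$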


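J.1.3 Hermitian conjugate of spinor matrix elements


$$[\bar{u}(p', s')\Gamma u(p, s)]^\dagger = \bar{u}(p, s)\bar{\Gamma} u(p', s') \tag{J.16}$$

where $\Gamma$ is any collection of γ matrices and

$$\bar{\Gamma} \equiv \gamma^0\Gamma^\dagger\gamma^0. \tag{J.17}$$

For example

$$\overline{\gamma^\mu} = \gamma^\mu \tag{J.18}$$

and

$$\overline{\gamma^\mu\gamma_5} = \gamma^\mu\gamma_5. \tag{J.19}$$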


J.1.4 Spin sums and projection operators

Positive-energy projection operator:

$$[\Lambda_+(p)]_{\alpha\beta} \equiv \sum_s u_\alpha(p, s)\bar{u}_\beta(p, s) = (\slashed{p} + m)_{\alpha\beta}. \tag{J.20}$$

Negative-energy projection operator:

$$[\Lambda_-(p)]_{\alpha\beta} \equiv -\sum_s v_\alpha(p, s)\bar{v}_\beta(p, s) = (-\slashed{p} + m)_{\alpha\beta}. \tag{J.21}$$

Note that these forms are specific to the normalizations

$$\bar{u}u = 2m \qquad \bar{v}v = -2m \tag{J.22}$$

for the spinors.


J.2 Trace theorems 


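$$\mathrm{Tr}\,\mathbf{1} = 4 \quad \text{(theorem 1)} \tag{J.23}$$
$$\mathrm{Tr}\,\gamma_5 = 0 \quad \text{(theorem 2)} \tag{J.24}$$
$$\mathrm{Tr}(\text{odd number of } \gamma\text{'s}) = 0 \quad \text{(theorem 3)}. \tag{J.25}$$

Proof

Consider

$$T \equiv \mathrm{Tr}(\slashed{a}_1\slashed{a}_2 \cdots \slashed{a}_n) \tag{J.26}$$

where $n$ is odd. Now insert $1 = (\gamma_5)^2$ into $T$, so that

$$T = \mathrm{Tr}(\slashed{a}_1\slashed{a}_2 \cdots \slashed{a}_n\gamma_5\gamma_5). \tag{J.27}$$

Move the first $\gamma_5$ to the front of $T$ by repeatedly using the result

$$\slashed{a}\gamma_5 = -\gamma_5\slashed{a}. \tag{J.28}$$

We therefore pick up $n$ minus signs:

$$T = \mathrm{Tr}(\slashed{a}_1 \cdots \slashed{a}_n) = (-1)^n\,\mathrm{Tr}(\gamma_5\slashed{a}_1 \cdots \slashed{a}_n\gamma_5)$$
$$\phantom{T} = (-1)^n\,\mathrm{Tr}(\slashed{a}_1 \cdots \slashed{a}_n\gamma_5\gamma_5) \quad \text{(cyclic property of trace)}$$
$$\phantom{T} = -\mathrm{Tr}(\slashed{a}_1 \cdots \slashed{a}_n) \quad \text{for } n \text{ odd}. \tag{J.29}$$

Thus, for $n$ odd, $T$ must vanish.

$$\mathrm{Tr}(\slashed{a}\slashed{b}) = 4\,a\cdot b \quad \text{(theorem 4)}. \tag{J.30}$$

Proof

$$\mathrm{Tr}(\slashed{a}\slashed{b}) = \tfrac{1}{2}\mathrm{Tr}(\slashed{a}\slashed{b} + \slashed{b}\slashed{a}) = \tfrac{1}{2}a_\mu b_\nu\,\mathrm{Tr}(\mathbf{1}\cdot 2g^{\mu\nu}) = 4\,a\cdot b.$$

$$\mathrm{Tr}(\slashed{a}\slashed{b}\slashed{c}\slashed{d}) = 4[(a\cdot b)(c\cdot d) + (a\cdot d)(b\cdot c) - (a\cdot c)(b\cdot d)] \quad \text{(theorem 5)}. \tag{J.31}$$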
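Proof

$$\mathrm{Tr}(\slashed{a}\slashed{b}\slashed{c}\slashed{d}) = 2(a\cdot b)\,\mathrm{Tr}(\slashed{c}\slashed{d}) - \mathrm{Tr}(\slashed{b}\slashed{a}\slashed{c}\slashed{d}) \tag{J.32}$$

using the result of (J.6). We continue taking $\slashed{a}$ through the trace in this
manner and use (J.30) to obtain

$$\mathrm{Tr}(\slashed{a}\slashed{b}\slashed{c}\slashed{d}) = 2(a\cdot b)\,4(c\cdot d) - 2(a\cdot c)\,\mathrm{Tr}(\slashed{b}\slashed{d}) + \mathrm{Tr}(\slashed{b}\slashed{c}\slashed{a}\slashed{d})$$
$$\phantom{\mathrm{Tr}} = 8(a\cdot b)(c\cdot d) - 8(a\cdot c)(b\cdot d) + 8(b\cdot c)(a\cdot d) - \mathrm{Tr}(\slashed{b}\slashed{c}\slashed{d}\slashed{a}) \tag{J.33}$$

and, since we can bring $\slashed{a}$ to the front of the trace, we have proved the theorem.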
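$$\mathrm{Tr}[\gamma_5\slashed{a}] = 0 \quad \text{(theorem 6)}. \tag{J.34}$$

This is a special case of theorem 3 since $\gamma_5$ contains four γ matrices.

$$\mathrm{Tr}[\gamma_5\slashed{a}\slashed{b}] = 0 \quad \text{(theorem 7)}. \tag{J.35}$$

This is not so obvious; it may be proved by writing out all the possible products
of γ matrices that arise.

$$\mathrm{Tr}[\gamma_5\slashed{a}\slashed{b}\slashed{c}] = 0 \quad \text{(theorem 8)}. \tag{J.36}$$

Again this is a special case of theorem 3.

$$\mathrm{Tr}[\gamma_5\slashed{a}\slashed{b}\slashed{c}\slashed{d}] = 4i\,\epsilon_{\alpha\beta\gamma\delta}\,a^\alpha b^\beta c^\gamma d^\delta \quad \text{(theorem 9)}. \tag{J.37}$$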


This theorem follows by looking at components: the $\epsilon$ tensor just gives the
correct sign of the permutation.

The $\epsilon$ tensor is the four-dimensional generalization of the three-dimensional
antisymmetric tensor $\epsilon_{ijk}$. In the three-dimensional case we have the
well-known results

$$(\boldsymbol{b}\times\boldsymbol{c})_i = \epsilon_{ijk}b_j c_k \tag{J.38}$$

and

$$\boldsymbol{a}\cdot(\boldsymbol{b}\times\boldsymbol{c}) = \epsilon_{ijk}a_i b_j c_k \tag{J.39}$$

for the triple scalar product.
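
The algebraic results of this appendix can be spot-checked with explicit matrices. The sketch below is a minimal illustration using only the Python standard library; it assumes the usual (Dirac) representation of the γ matrices, the metric signature (+, −, −, −) and $\epsilon_{0123} = +1$, and every helper name in it is hypothetical. It builds the matrices, forms $\slashed{a} = a_\mu\gamma^\mu$ from contravariant components $a^\mu$, and provides the ingredients needed to test the anticommutator and theorems 4, 5 and 9.

```python
from itertools import permutations

METRIC = [1, -1, -1, -1]  # diagonal of g_{mu nu}, signature (+, -, -, -)


def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def product(*matrices):
    """Ordered product of several matrices."""
    result = matrices[0]
    for m in matrices[1:]:
        result = matmul(result, m)
    return result


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def block(tl, tr, bl, br):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return [tl[0] + tr[0], tl[1] + tr[1], bl[0] + br[0], bl[1] + br[1]]


def negate(m):
    return [[-x for x in row] for row in m]


I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

# gamma^0 = diag(1, -1); gamma^i has sigma_i and -sigma_i off the diagonal.
GAMMA = [block(I2, Z2, Z2, negate(I2))] + [block(Z2, s, negate(s), Z2) for s in SIGMA]
GAMMA5 = [[1j * x for x in row] for row in product(*GAMMA)]  # i g0 g1 g2 g3


def dot(a, b):
    """Minkowski product a.b of two contravariant 4-vectors."""
    return sum(METRIC[mu] * a[mu] * b[mu] for mu in range(4))


def slash(a):
    """a-slash = a_mu gamma^mu for contravariant components a^mu."""
    return [[sum(METRIC[mu] * a[mu] * GAMMA[mu][i][j] for mu in range(4))
             for j in range(4)] for i in range(4)]


def epsilon_contract(a, b, c, d):
    """epsilon_{alpha beta gamma delta} a^alpha b^beta c^gamma d^delta, epsilon_{0123} = +1."""
    total = 0
    for perm in permutations(range(4)):
        inversions = sum(1 for i in range(4) for j in range(i + 1, 4) if perm[i] > perm[j])
        sign = -1 if inversions % 2 else 1
        total += sign * a[perm[0]] * b[perm[1]] * c[perm[2]] * d[perm[3]]
    return total


if __name__ == "__main__":
    a, b = [1, 2, 3, 4], [2, -1, 0, 5]  # arbitrary illustrative 4-vectors
    print(trace(product(slash(a), slash(b))), 4 * dot(a, b))
```

Integer components are used so that the comparisons are exact up to rounding; any real 4-vectors would serve equally well.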
Example of a Cross Section Calculation


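In this appendix we outline in more detail the calculation of the $e^-s^+$ elastic
scattering cross section in section 8.3.2. The standard factors for the
unpolarized cross section lead to the expression

$$d\sigma = \frac{1}{4E\omega|\boldsymbol{v}|}\,\frac{1}{2}\sum_{s,s'}|\mathcal{M}_{e^-s^+}(s, s')|^2\, d\mathrm{Lips}(s; k', p') \tag{K.1}$$

$$\phantom{d\sigma} = \frac{1}{4[(k\cdot p)^2 - m^2M^2]^{1/2}}\,\frac{1}{2}\sum_{s,s'}|\mathcal{M}_{e^-s^+}(s, s')|^2\, d\mathrm{Lips}(s; k', p') \tag{K.2}$$

using the result of problem 6.9, and the definition of Lorentz-invariant phase
space:

$$d\mathrm{Lips}(s; k', p') = (2\pi)^4\delta^4(k' + p' - k - p)\,\frac{d^3\boldsymbol{p}'}{(2\pi)^3 2E'}\,\frac{d^3\boldsymbol{k}'}{(2\pi)^3 2\omega'}. \tag{K.3}$$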


Instead of evaluating the matrix element and phase space integral in the CM
frame, or writing the result in invariant form, we shall perform the calculation
entirely in the ‘laboratory’ frame, defined as the frame in which the target (i.e.
the s-particle) is at rest:

$$p^\mu = (M, \boldsymbol{0}) \tag{K.4}$$

where $M$ is the s-particle mass. Let us look in some detail at the ‘laboratory’
frame kinematics for elastic scattering (figure K.1). Conservation of energy
and momentum in the form

$$p'^2 = (p + q)^2 \tag{K.5}$$

allows us to eliminate $p'$ to obtain the elastic scattering condition

$$2p\cdot q + q^2 = 0 \tag{K.6}$$

or

$$2p\cdot q = Q^2 \tag{K.7}$$

if we introduce the positive quantity

$$Q^2 = -q^2 \tag{K.8}$$

for a scattering process.
FIGURE K.1
Laboratory frame kinematics. 


In all the applications with which we are concerned it will be a good 
approximation to neglect electron mass effects for high-energy electrons. We 
therefore set 


$$k^2 = k'^2 \approx 0 \tag{K.9}$$

so that

$$s + t + u \approx 2M^2 \tag{K.10}$$

where

$$s = (k + p)^2 = (k' + p')^2 \tag{K.11}$$
$$t = (k - k')^2 = (p' - p)^2 = q^2 \tag{K.12}$$
$$u = (k - p')^2 = (k' - p)^2 \tag{K.13}$$

are the usual Mandelstam variables. For the electron 4-vectors

$$k^\mu = (\omega, \boldsymbol{k}) \tag{K.14}$$
$$k'^\mu = (\omega', \boldsymbol{k}') \tag{K.15}$$

we can neglect the difference between the magnitude of the 3-momentum and
the energy,

$$\omega = |\boldsymbol{k}| \equiv k \tag{K.16}$$
$$\omega' = |\boldsymbol{k}'| \equiv k' \tag{K.17}$$

and in this approximation

$$q^2 = -2kk'(1 - \cos\theta) \tag{K.18}$$

or

$$q^2 = -4kk'\sin^2(\theta/2). \tag{K.19}$$

The elastic scattering condition (K.7) gives the following relation between $k$, $k'$
and $\theta$:

$$(k/k') = 1 + (2k/M)\sin^2(\theta/2). \tag{K.20}$$

It is important to realize that this relation is only true for elastic scattering:
for inclusive inelastic electron scattering $k$, $k'$ and $\theta$ are independent variables.
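
A short numerical sketch may help to see how the elastic condition ties the three ‘laboratory’ variables together. The code below assumes natural units and a massless electron, as in the text; the function names and the sample values are illustrative only. It computes $k'$ from (K.20), $q^2$ from (K.19), and the residual of the elastic condition (K.6) with $p\cdot q = M(k - k')$ in the target rest frame.

```python
import math


def scattered_energy(k, theta, mass):
    """Scattered electron energy k' from the elastic relation (K.20)."""
    return k / (1.0 + (2.0 * k / mass) * math.sin(theta / 2.0) ** 2)


def q_squared(k, k_prime, theta):
    """Four-momentum transfer squared, (K.19), neglecting the electron mass."""
    return -4.0 * k * k_prime * math.sin(theta / 2.0) ** 2


def elastic_residual(k, theta, mass):
    """2 p.q + q^2 evaluated in the target rest frame; zero for elastic scattering."""
    k_prime = scattered_energy(k, theta, mass)
    p_dot_q = mass * (k - k_prime)  # p = (M, 0), q = k - k'
    return 2.0 * p_dot_q + q_squared(k, k_prime, theta)


def recoil_energy_mismatch(k, theta, mass):
    """Difference between E' from energy conservation and from E'^2 = p'^2 + M^2."""
    k_prime = scattered_energy(k, theta, mass)
    p_prime_sq = k * k + k_prime * k_prime - 2.0 * k * k_prime * math.cos(theta)
    return (mass + k - k_prime) - math.sqrt(p_prime_sq + mass * mass)


if __name__ == "__main__":
    # Hypothetical example: a 4 GeV electron on a target of mass 0.938 GeV at 30 degrees.
    k, theta, mass = 4.0, math.radians(30.0), 0.938
    print(scattered_energy(k, theta, mass), q_squared(k, scattered_energy(k, theta, mass), theta))
```

For inelastic scattering no such constraint applies, and `scattered_energy` would have to be replaced by an independently measured $k'$.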
The first element of the cross section, the flux factor, is easy to evaluate:

$$4[(k\cdot p)^2 - m^2M^2]^{1/2} = 4Mk \tag{K.21}$$

in the approximation of neglecting the electron mass $m$. We now consider the
calculation of the spin-averaged matrix element and the phase space integral
in turn.


K.1 The spin-averaged squared matrix element 


The Feynman rules for $e^-s^+$ scattering enable us to write the spin sum in the
form

$$\frac{1}{2}\sum_{s,s'}|\mathcal{M}_{e^-s^+}(s, s')|^2 = \left(\frac{4\pi\alpha}{q^2}\right)^2 L_{\mu\nu}T^{\mu\nu} \tag{K.22}$$

where $L_{\mu\nu}$ is the lepton tensor, $T^{\mu\nu}$ the s-particle tensor and the one-photon
exchange approximation has been assumed. From problem 8.12 we find the
result

$$L_{\mu\nu}T^{\mu\nu} = 8[2(k\cdot p)(k'\cdot p) + (q^2/2)M^2]. \tag{K.23}$$

In the ‘laboratory’ frame, neglecting the electron mass, this becomes

$$L_{\mu\nu}T^{\mu\nu} = 16M^2kk'\cos^2(\theta/2). \tag{K.24}$$


K.2 Evaluation of two-body Lorentz-invariant phase
space in ‘laboratory’ variables

We must evaluate

$$d\mathrm{Lips}(s; k', p') = \frac{1}{(4\pi)^2}\,\delta^4(k' + p' - k - p)\,\frac{d^3\boldsymbol{p}'}{E'}\,\frac{d^3\boldsymbol{k}'}{\omega'} \tag{K.25}$$

in terms of ‘laboratory’ variables. This is in fact rather tricky and requires
some care. There are several ways it can be done:


(i) Use CM variables, put the cross section into invariant form, and then
translate to the ‘laboratory’ frame. This involves relating $dq^2$ to
$d(\cos\theta)$, which we shall do as an exercise at the end of this appendix.


(ii) Alternatively, we can work directly in terms of ‘laboratory’ variables
and write

$$d^3\boldsymbol{p}'/2E' = d^4p'\,\delta(p'^2 - M^2)\,\theta(p'^0). \tag{K.26}$$

The four-dimensional δ-function then removes the integration over $d^4p'$,
leaving us only with an integration over the single δ-function
$\delta(p'^2 - M^2)$, in which $p'$ is understood to be replaced by $k + p - k'$. For details
of this last integration, see Bjorken and Drell (1964, p 114).

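(iii) We shall evaluate the phase space integral in a more direct manner. We
begin by performing the integral over $d^3\boldsymbol{p}'$ using the three-dimensional
δ-function from $\delta^4(k' + p' - k - p)$. In the ‘laboratory’ frame $\boldsymbol{p} = \boldsymbol{0}$,
so we have

$$\int d^3\boldsymbol{p}'\,\delta^3(\boldsymbol{k}' + \boldsymbol{p}' - \boldsymbol{k})\, f(\boldsymbol{p}', \boldsymbol{k}', \boldsymbol{k}) = f(\boldsymbol{p}', \boldsymbol{k}', \boldsymbol{k})\big|_{\boldsymbol{p}' = \boldsymbol{k} - \boldsymbol{k}'}. \tag{K.27}$$

In the particular function $f(\boldsymbol{p}', \boldsymbol{k}', \boldsymbol{k})$ that we require, $\boldsymbol{p}'$ only appears via $E'$,
since

$$E'^2 = \boldsymbol{p}'^2 + M^2 \tag{K.28}$$

and

$$\boldsymbol{p}'^2 = k^2 + k'^2 - 2kk'\cos\theta \tag{K.29}$$

(setting the electron mass $m$ to zero). We now change $d^3\boldsymbol{k}'$ to angular
variables:

$$d^3\boldsymbol{k}'/\omega' \to k'\,dk'\,d\Omega \tag{K.30}$$

leading to

$$d\mathrm{Lips}(s; k', p') = \frac{1}{(4\pi)^2}\,d\Omega\,dk'\,\frac{k'}{E'}\,\delta(E' + k' - k - M). \tag{K.31}$$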


Since $E'$ is a function of $k'$ and $\theta$ for a given $k$ (cf (K.28) and (K.29)), the
δ-function relates $k'$ and $\theta$ as required for elastic scattering (cf (K.20)), but until
the δ-function integration is performed they must be regarded as independent
variables. We have the integral

$$\frac{1}{(4\pi)^2}\,d\Omega\int dk'\,\frac{k'}{E'}\,\delta(f(k', \cos\theta)) \tag{K.32}$$

where

$$f(k', \cos\theta) = [(k^2 + k'^2 - 2kk'\cos\theta) + M^2]^{1/2} + k' - k - M \tag{K.33}$$

remaining to be evaluated. In order to obtain a differential cross section,
we wish to integrate over $k'$; for this $k'$ integration we must regard $\cos\theta$ in
$f(k', \cos\theta)$ as a constant, and use the result (E.36):

$$\delta(f(x)) = \frac{1}{|f'(x)|_{x = x_0}}\,\delta(x - x_0) \tag{K.34}$$

where $f(x_0) = 0$. The required derivative is

$$\frac{df}{dk'}\bigg|_{\text{constant } \cos\theta} = \frac{1}{E'}(E' + k' - k\cos\theta) \tag{K.35}$$
and the δ-function requires that $k'$ is determined from $k$ and $\theta$ by the elastic
scattering condition

$$k' = \frac{k}{1 + (2k/M)\sin^2(\theta/2)} \equiv k'(\cos\theta). \tag{K.36}$$

The integral (K.32) becomes

$$\frac{1}{(4\pi)^2}\,d\Omega\int dk'\,\frac{k'}{E'}\,\frac{1}{|df/dk'|_{k' = k'(\cos\theta)}}\,\delta[k' - k'(\cos\theta)] \tag{K.37}$$

and, after some juggling, $df/dk'$ evaluated at $k' = k'(\cos\theta)$ may be written
as

$$\frac{df}{dk'}\bigg|_{k' = k'(\cos\theta)} = \frac{Mk}{E'k'}. \tag{K.38}$$

Thus we obtain finally the result

$$d\mathrm{Lips}(s; k', p') = \frac{1}{(4\pi)^2}\,\frac{k'^2}{Mk}\,d\Omega \tag{K.39}$$

for two-body elastic scattering in terms of ‘laboratory’ variables, neglecting
lepton masses.
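Putting all these elements together yields the advertised result

$$\frac{d\sigma}{d\Omega} \equiv \left(\frac{d\sigma}{d\Omega}\right)_{\mathrm{ns}} = \frac{\alpha^2}{4k^2\sin^4(\theta/2)}\,\frac{k'}{k}\,\cos^2(\theta/2). \tag{K.40}$$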
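
The assembly of the pieces can be followed numerically. The sketch below is illustrative only: it assumes natural units, a massless electron, an approximate value for the fine structure constant, and hypothetical function names. It checks the derivative (K.38) against a finite difference of the function in (K.33), and builds the cross section from the flux factor (K.21), the spin sum (K.22) with (K.24), and the phase space (K.39), for comparison with the closed form (K.40).

```python
import math

ALPHA = 1.0 / 137.036  # approximate fine structure constant (assumed value)


def scattered_energy(k, theta, mass):
    """k' from the elastic condition (K.36)."""
    return k / (1.0 + (2.0 * k / mass) * math.sin(theta / 2.0) ** 2)


def delta_argument(k_prime, k, cos_theta, mass):
    """The function f(k', cos theta) of (K.33)."""
    recoil = math.sqrt(k * k + k_prime * k_prime - 2.0 * k * k_prime * cos_theta + mass * mass)
    return recoil + k_prime - k - mass


def jacobian_analytic(k, theta, mass):
    """df/dk' at the elastic point, M k / (E' k'), as in (K.38)."""
    k_prime = scattered_energy(k, theta, mass)
    e_prime = mass + k - k_prime
    return mass * k / (e_prime * k_prime)


def jacobian_numeric(k, theta, mass, rel_step=1e-6):
    """Central finite difference of f with respect to k' at fixed cos theta."""
    k_prime = scattered_energy(k, theta, mass)
    h = rel_step * k_prime
    c = math.cos(theta)
    return (delta_argument(k_prime + h, k, c, mass)
            - delta_argument(k_prime - h, k, c, mass)) / (2.0 * h)


def assembled_cross_section(k, theta, mass, alpha=ALPHA):
    """d sigma / d Omega built from flux (K.21), spin sum (K.22, K.24) and phase space (K.39)."""
    k_prime = scattered_energy(k, theta, mass)
    q2 = -4.0 * k * k_prime * math.sin(theta / 2.0) ** 2
    flux = 4.0 * mass * k
    spin_sum = ((4.0 * math.pi * alpha / q2) ** 2
                * 16.0 * mass ** 2 * k * k_prime * math.cos(theta / 2.0) ** 2)
    lips_per_solid_angle = k_prime ** 2 / ((4.0 * math.pi) ** 2 * mass * k)
    return spin_sum * lips_per_solid_angle / flux


def no_structure_cross_section(k, theta, mass, alpha=ALPHA):
    """Closed form (K.40)."""
    k_prime = scattered_energy(k, theta, mass)
    return (alpha ** 2 / (4.0 * k ** 2 * math.sin(theta / 2.0) ** 4)
            * (k_prime / k) * math.cos(theta / 2.0) ** 2)


if __name__ == "__main__":
    k, theta, mass = 4.0, math.radians(30.0), 0.938  # hypothetical inputs, GeV and radians
    print(assembled_cross_section(k, theta, mass), no_structure_cross_section(k, theta, mass))
```

With energies in GeV the result is in GeV$^{-2}$ per steradian; converting to conventional area units requires the usual factor of $(\hbar c)^2$.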


As a final twist to this calculation let us consider the change of variables from
$d\Omega$ to $dq^2$ in this elastic scattering example. In the unpolarized case

$$d\Omega = 2\pi\,d(\cos\theta) \tag{K.41}$$

and

$$q^2 = -2kk'(1 - \cos\theta) \tag{K.42}$$

where

$$k' = \frac{k}{1 + (2k/M)\sin^2(\theta/2)}. \tag{K.43}$$

Thus, since $k'$ and $\cos\theta$ are not independent variables, we have

$$dq^2 = 2kk'\,d(\cos\theta) - 2k(1 - \cos\theta)\,\frac{dk'}{d(\cos\theta)}\,d(\cos\theta). \tag{K.44}$$

From (K.20) we find

$$\frac{dk'}{d(\cos\theta)} = \frac{k'^2}{M} \tag{K.45}$$
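and, after some routine juggling, arrive at the result

$$dq^2 = 2k'^2\,d(\cos\theta). \tag{K.46}$$

If we introduce the variable $\nu$ defined, for elastic scattering, by

$$2p\cdot q \equiv 2M\nu = -q^2 \tag{K.47}$$

we have immediately

$$d\nu = -\frac{k'^2}{M}\,d(\cos\theta). \tag{K.48}$$

Similarly, if we introduce the variable $y$ defined by

$$y = \nu/k \tag{K.49}$$

we find

$$dy = -\frac{k'^2}{kM}\,d(\cos\theta) \tag{K.50}$$

for elastic scattering. The minus signs in (K.48) and (K.50) follow from (K.46)
and (K.47): $\nu$ and $y$ decrease as $\cos\theta$ increases. Only the magnitude of the
Jacobian matters when a cross section is re-expressed in these variables.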
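
The two Jacobians used in this change of variables can be verified by finite differences. The following sketch assumes the elastic relation between $k'$ and $\cos\theta$ written in terms of $\cos\theta$ (using $2\sin^2(\theta/2) = 1 - \cos\theta$), natural units and a massless electron; the function names and sample values are illustrative only.

```python
import math


def scattered_energy(k, cos_theta, mass):
    """Elastic k' as a function of cos(theta): k / (1 + (k/M)(1 - cos theta))."""
    return k / (1.0 + (k / mass) * (1.0 - cos_theta))


def q_squared(k, cos_theta, mass):
    """q^2 = -2 k k' (1 - cos theta) along the elastic curve."""
    return -2.0 * k * scattered_energy(k, cos_theta, mass) * (1.0 - cos_theta)


def derivative(func, x, step=1e-6):
    """Central finite-difference derivative of a one-variable function."""
    return (func(x + step) - func(x - step)) / (2.0 * step)


def dk_prime_dcos(k, cos_theta, mass):
    """Numerical dk'/d(cos theta), to be compared with k'^2 / M."""
    return derivative(lambda c: scattered_energy(k, c, mass), cos_theta)


def dq2_dcos(k, cos_theta, mass):
    """Numerical dq^2/d(cos theta), to be compared with 2 k'^2."""
    return derivative(lambda c: q_squared(k, c, mass), cos_theta)


if __name__ == "__main__":
    k, cos_theta, mass = 4.0, 0.5, 0.938  # hypothetical inputs
    k_prime = scattered_energy(k, cos_theta, mass)
    print(dk_prime_dcos(k, cos_theta, mass), k_prime ** 2 / mass)
    print(dq2_dcos(k, cos_theta, mass), 2.0 * k_prime ** 2)
```

Both comparisons hold at any angle away from the endpoints $\cos\theta = \pm 1$, where the finite-difference stencil would step outside the physical range.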


L 


Feynman Rules for Tree Graphs in QED 


2 → 2 cross section formula

$$d\sigma = \frac{1}{4[(p_1\cdot p_2)^2 - m_1^2m_2^2]^{1/2}}\,|\mathcal{M}|^2\,d\mathrm{Lips}(s; p_3, p_4).$$

1 → 2 decay formula

$$d\Gamma = \frac{1}{2m_1}\,|\mathcal{M}|^2\,d\mathrm{Lips}(m_1^2; p_2, p_3).$$

Note that for two identical particles in the final state an extra factor of $\frac{1}{2}$
must be included in these formulae.

The amplitude iM is the invariant matrix element for the process under 
consideration, and is given by the Feynman rules of the relevant theory. For 
particles with non-zero spin, unpolarized cross sections are formed by averag- 
ing over initial spin components and summing over final. 




L.1 External particles 


Spin-½

For each fermion or antifermion line entering the graph, include the spinor

$$u(p, s) \quad\text{or}\quad \bar{v}(p, s) \tag{L.1}$$

and for spin-½ particles leaving the graph the spinor

$$\bar{u}(p', s') \quad\text{or}\quad v(p', s'). \tag{L.2}$$

Photons

For each photon line entering the graph include a polarization vector

$$\epsilon_\mu(k, \lambda) \tag{L.3}$$

and for photons leaving the graph the vector

$$\epsilon^*_\mu(k', \lambda'). \tag{L.4}$$


L.2 Propagators 


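Spin-0

$$\frac{i}{p^2 - m^2 + i\epsilon} \tag{L.5}$$

Spin-½

$$\frac{i}{\slashed{p} - m} = i\,\frac{\slashed{p} + m}{p^2 - m^2 + i\epsilon} \tag{L.6}$$

Photon

$$\frac{i}{k^2 + i\epsilon}\left(-g^{\mu\nu} + (1 - \xi)\frac{k^\mu k^\nu}{k^2}\right) \tag{L.7}$$

for a general $\xi$. Calculations are usually performed in the Lorentz or Feynman
gauge with $\xi = 1$ and photon propagator equal to

$$\frac{-ig^{\mu\nu}}{k^2 + i\epsilon}. \tag{L.8}$$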


L.3 Vertices 
Spin-0 


Spin-½

$$-ie\gamma_\mu \quad \text{(for charge } +e\text{)}$$

(fermion line with momenta $p$ and $p'$ at the vertex)
Preface to Volume 2 of the Fourth Edition


The main focus of the second volume of this fourth edition, as in the third, is 
on the two non-Abelian quantum gauge field theories of the Standard Model 
— that is, QCD and the electroweak theory of Glashow, Salam and Weinberg. 
We preserve the same division into four parts: non-Abelian symmetries, both 
global and local; QCD and the renormalization group; spontaneously broken 
symmetry; and weak interaction phenomenology and the electroweak theory. 

However, the book has always combined theoretical development with dis- 
cussion of relevant experimental results. And it is on the experimental side 
that most progress has been made in the ten years since the third edition 
appeared — first of all, in the study of CP violation in B-meson physics, and 
in neutrino oscillations. The inclusion of these results, and the increasing im- 
portance of the topics, have required some reorganization, and a new chapter 
(21) devoted wholly to them. We concentrate mainly on CP-violation in B- 
meson decays, particularly on the determination of the angles of the unitarity 
triangle from B-meson oscillations. CP-violation in K-meson systems is also 
discussed. In the neutrino sector, we describe some of the principal experi- 
ments which have led to our current knowledge of the mass-squared differences 
and the mixing angles. In discussing weak interaction phenomenology, we keep 
in view the possibility that neutrinos may turn out to be Majorana particles, 
an outcome for which we have prepared the reader in (new) chapters 4 and 7 
of volume 1. 

More recently, on July 4, 2012, the ATLAS and CMS collaborations at 
the CERN LHC announced the discovery of a boson of mass between 125 and 
126 GeV, with production and decay characteristics which are consistent (at 
the 1σ level) with those of the Standard Model Higgs boson. We can now
conclude our treatment of the electroweak theory, and this volume, with a 
discussion of this historic discovery, which opens a new era in particle physics 
— one in which the electroweak symmetry-breaking (Higgs) sector of the SM 
will be rigorously tested. 

Our treatment of a number of topics has been updated and, we hope,
improved. In QCD, the definition of 2-jet cross sections in $e^+e^-$ annihilation
is explained, and used in a short discussion of jet algorithms (sections 14.5 
and 14.6). Progress in lattice QCD is recognized with the inclusion of some 
of the recent impressive results using dynamical fermions (section 16.5). In 
the chapter on chiral symmetry breaking, a new section (18.3) introduces the
important technique of effective Lagrangians, including the extension to the
three-flavour case and the associated mass relations. A much fuller account is 
given of three-generation quark mixing and the CKM matrix (section 20.7.3), 
as preparation for chapter 21. The essential points in chapter 21 of the pre- 
vious edition, relating to problems with the current-current and IVB models, 
now provide the introductory motivation for the GSW theory in chapter 22. 

One item has been banished to an appendix: geometrical aspects of gauge 
theories, which did after all seem to interrupt the flow of chapter 13 too much 
(but we hope readers will not ignore it). And another has been brought in 
from the cold: as already mentioned, Majorana fermions now find themselves 
appearing for the first time in volume 1. 
Global Non-Abelian Symmetries


12.1 The Standard Model 


In the preceding volume, a very successful dynamical theory — QED — has been 
introduced, based on the remarkably simple gauge principle: namely, that the 
theory should be invariant under local phase transformations on the wave- 
functions (chapter 2) or field operators (chapter 7) of charged particles. Such 
transformations were characterized as Abelian in section 2.6, since the phase 
factors commuted. The second volume of this book will be largely concerned 
with the formulation and elementary application of the remaining two dynam- 
ical theories within the Standard Model — that is, QCD and the electroweak 
theory. They are built on a generalization of the gauge principle, in which the 
transformations involve more than one state, or field, at a time. In that case, 
the ‘phase factors’ become matrices, which generally do not commute with 
each other, and the associated symmetry is called a ‘non-Abelian’ one. When 
the phase factors are independent of the space-time coordinate x, the
symmetry is a ‘global non-Abelian’ one; when they are allowed to depend on x, one
is led to a non-Abelian gauge theory. Both QCD and the electroweak theory 
are of the latter type, providing generalizations of the Abelian U(1) gauge 
theory which is QED. It is a striking fact that all three dynamical theories in 
the Standard Model are based on a gauge principle of local phase invariance. 

In this chapter we shall be mainly concerned with two global non-Abelian 
symmetries, which lead to useful conservation laws but not to any specific 
dynamical theory. We begin in section 12.1 with the first non-Abelian sym- 
metry to be used in particle physics, the hadronic isospin ‘SU(2) symmetry’ 
proposed by Heisenberg (1932) in the context of nuclear physics, and now 
understood as following from QCD and the smallness of the u and d quark 
masses as compared with the QCD scale parameter $\Lambda_{\overline{\rm MS}}$ (see section 18.3.3).
In section 12.2 we extend this to SU(3)$_{\rm f}$ flavour symmetry, as was first done
by Gell-Mann (1961) and Ne’eman (1961) — an extension seen, in its turn, as 
reflecting the smallness of the u, d and s quark masses as compared with $\Lambda_{\overline{\rm MS}}$.
The ‘wavefunction’ approach of sections 12.1 and 12.2 is then reformulated in 
field-theoretic language in section 12.3. 

In the last section of this chapter, we shall introduce the idea of a global 
chiral symmetry, which is a symmetry of theories with massless fermions. This 
may be expected to be a good approximate symmetry for the u and d quarks. 
But the anticipated observable consequences of this symmetry (for example,
nucleon parity doublets) appear to be absent. This puzzle will be resolved 
in Part VII, via the profoundly important concept of ‘spontaneous symmetry 
breaking’. 

The formalism introduced in this chapter for SU(2) and SU(3) will be 
required again in the following one, when we consider the local versions of 
these non-Abelian symmetries and the associated dynamical gauge theories. 
The whole modern development of non-Abelian gauge theories began with 
the attempt by Yang and Mills (1954) (see also Shaw 1955) to make hadronic 
isospin into a local symmetry. However, the beautiful formalism developed 
by these authors turned out not to describe interactions between hadrons. 
Instead, it describes the interactions between the constituents of the hadrons, 
namely quarks — and this in two respects. First, a local SU(3) symmetry 
(called SU(3)$_{\rm c}$) governs the strong interactions of quarks, binding them into
hadrons (see Part VI). Secondly, a local SU(2) symmetry (called weak isospin) 
governs the weak interactions of quarks (and leptons); together with QED, this 
constitutes the electroweak theory (see Part VIII). It is important to realize 
that, despite the fact that each of these two local symmetries is based on 
the same group as one of the earlier global (flavour) symmetries, the physics 
involved is completely different. In the case of the strong quark interactions, 
the SU(3)c group refers to a new degree of freedom (‘colour’) which is quite
distinct from flavour u, d, s (see chapter 14). In the weak interaction case, 
since the group is an SU(2), it is natural to use ‘isospin language’ in talking 
about it, particularly since flavour degrees of freedom are involved. But we 
must always remember that it is weak isospin, which (as we shall see in chapter 
20) is an attribute of leptons as well as of quarks, and hence physically quite 
distinct from hadronic isospin. Furthermore, it is a parity-violating chiral 
gauge theory. 

Despite the attractive conceptual unity associated with the gauge prin- 
ciple, the way in which each of QCD and the electroweak theory ‘works’ is 
actually quite different from QED, and from each other. Indeed it is worth 
emphasizing very strongly that it is, a priori, far from obvious why either the 
strong interactions between quarks, or the weak interactions, should have any- 
thing to do with gauge theories at all. Just as in the U(1) (electromagnetic) 
case, gauge invariance forbids a mass term in the Lagrangian for non-Abelian 
gauge fields, as we shall see in chapter 13. Thus it would seem that gauge 
field quanta are necessarily massless. But this, in turn, would imply that the 
associated forces must have a long-range (Coulombic) part, due to exchange of 
these massless quanta, and of course in neither the strong nor the weak
interaction case is that what is observed.¹ As regards the former, the gluon
quanta are indeed massless, but the contradiction is resolved by
non-perturbative effects which lead to confinement, as we indicated in
chapter 1. We shall discuss this further in chapter 16. In weak interactions,
a third realization appears: the gauge quanta acquire mass via (it is
believed) a second instance of spontaneous symmetry breaking, as will be
explained in Part VII. In fact a further application of this idea is required
in the electroweak theory, because of the chiral nature of the gauge symmetry
in this case: the quark and lepton masses also must be ‘spontaneously
generated’.

¹ Pauli had independently developed the theory of non-Abelian gauge fields
during 1953, but did not publish any of this work because of the seeming
physical irrelevancy associated with the masslessness problem (Enz 2002,
pages 474-82; Pais 2000, pages 242-5).


12.1 The flavour symmetry SU(2)f
12.1.1 The nucleon isospin doublet and the group SU(2)


The transformations initially considered in connection with the gauge principle 
in section 2.5 were just global phase transformations on a single wavefunction 


$$\psi \to \psi' = e^{i\alpha}\psi. \qquad (12.1)$$
The generalization to non-Abelian invariances comes when we take the sim- 
ple step — but one with many ramifications — of considering more than one 
wavefunction, or state, at a time. Quite generally in quantum mechanics, we 
know that whenever we have a set of states which are degenerate in energy (or 
mass) there is no unique way of specifying the states: any linear combination 
of some initially chosen set of states will do just as well, provided the normal- 
ization conditions on the states are still satisfied. Consider, for example, the 
simplest case of just two such states — to be specific, the neutron and proton 
(figure 12.1). This single near coincidence of the masses was enough to suggest 
to Heisenberg (1932) that, as far as the strong nuclear forces were concerned 
(electromagnetism being negligible by comparison), the two states could be 
regarded as truly degenerate, so that any arbitrary linear combination of
neutron and proton wavefunctions would be entirely equivalent, as far as this
force was concerned, to a single ‘neutron’ or single ‘proton’ wavefunction.
This hypothesis became known as ‘charge independence of nuclear forces’. 
Thus redefinitions of neutron and proton wavefunctions could be allowed, of 
the form 
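$$\psi_p \to \psi_p' = \alpha\psi_p + \beta\psi_n \qquad (12.2)$$
$$\psi_n \to \psi_n' = \gamma\psi_p + \delta\psi_n \qquad (12.3)$$
for complex coefficients $\alpha$, $\beta$, $\gamma$ and $\delta$. In particular, since $\psi_p$ and $\psi_n$ are
degenerate, we have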


$$H\psi_p = E\psi_p, \qquad H\psi_n = E\psi_n \qquad (12.4)$$
from which it follows that 


$$H\psi_p' = H(\alpha\psi_p + \beta\psi_n) = \alpha H\psi_p + \beta H\psi_n \qquad (12.5)$$
$$= E(\alpha\psi_p + \beta\psi_n) = E\psi_p' \qquad (12.6)$$
and similarly
$$H\psi_n' = E\psi_n' \qquad (12.7)$$
showing that the redefined wavefunctions still describe two states with the
same energy degeneracy.

FIGURE 12.1
Early evidence for isospin symmetry.

The two-fold degeneracy seen in figure 12.1 is suggestive of that found in
spin-1/2 systems in the absence of any magnetic field; the $s_z = \pm\frac{1}{2}$ components
are degenerate. The analogy can be brought out by introducing the
two-component nucleon isospinor


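$$\psi^{(1/2)} \equiv \begin{pmatrix} \psi_p \\ \psi_n \end{pmatrix} \equiv \psi_p\chi_p + \psi_n\chi_n \qquad (12.8)$$
where
$$\chi_p = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad \chi_n = \begin{pmatrix} 0 \\ 1 \end{pmatrix}. \qquad (12.9)$$

In $\psi^{(1/2)}$, $\psi_p$ is the amplitude for the nucleon to have ‘isospin up’, and $\psi_n$ is
that for it to have ‘isospin down’.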

As far as the states are concerned, this terminology arises, of course, from 
the formal identity between the ‘isospinors’ of (12.9) and the two-component 
eigenvectors (3.60) corresponding to eigenvalues $\pm\frac{1}{2}\hbar$ of (true) spin: compare
also (3.61) and (12.8). It is important to be clear, however, that the degrees of
freedom involved in the two cases are quite distinct; in particular, even though
both the proton and the neutron have (true) spin-1/2, the transformations
(12.2) and (12.3) leave the (true) spin part of their wavefunctions completely
untouched. Indeed, we are suppressing the spinor part of both wavefunctions 
altogether (they are of course 4-component Dirac spinors). As we proceed, 
the precise mathematical nature of this ‘spin-1/2’ analogy will become clear. 

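Equations (12.2) and (12.3) can be compactly written in terms of $\psi^{(1/2)}$
as
$$\psi^{(1/2)} \to \psi^{(1/2)\prime} = V\psi^{(1/2)}, \qquad V = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \qquad (12.10)$$
where V is the indicated complex 2 x 2 matrix. Heisenberg's proposal, then,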
was that the physics of strong interactions between nucleons remained the 
same under the transformation (12.10): in other words, a symmetry was in- 
volved. We must emphasise that such a symmetry can only be exact in the 
absence of electromagnetic interactions: it is therefore an intrinsically approx- 
imate symmetry, though presumably quite a useful one in view of the relative 
weakness of electromagnetic interactions as compared to hadronic ones. 
We now consider the general form of the matrix V, as constrained by
various relevant restrictions: quite remarkably, we shall discover that (after 
extracting an overall phase) V has essentially the same mathematical form 
as the matrix U of (4.33), which we encountered in the discussion of the 
transformation of (real) spin wavefunctions under rotations of the (real) space 
axes. It will be instructive to see how the present discussion leads to the same 
form (4.33). 

We first note that V of (12.10) depends on four arbitrary complex numbers, 
or alternatively on eight real parameters. By contrast, the matrix U of (4.33) 
depends on only three real parameters, which we may think of in terms of two 
to describe the direction of the axis of rotation, and a third for the angle of 
rotation. However, V is subject to certain restrictions, and these reduce the 
number of free parameters in V to three, as we now discuss. First, in order 
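to preserve the normalization of $\psi^{(1/2)}$ we require
$$\psi^{(1/2)\prime\dagger}\psi^{(1/2)\prime} = \psi^{(1/2)\dagger}V^\dagger V\psi^{(1/2)} = \psi^{(1/2)\dagger}\psi^{(1/2)} \qquad (12.11)$$
which implies that V has to be unitary:
$$V^\dagger V = 1_2, \qquad (12.12)$$
where $1_2$ is the unit 2 x 2 matrix. Clearly this unitarity property is in no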
way restricted to the case of two states: the transformation coefficients for 
n degenerate states will form the entries of an n x n unitary matrix. A 
trivialization is the case n = 1, for which, as we noted in section 2.6, V reduces 
to a single phase factor as in (12.1), indicating how all the previous work is 
going to be contained as a special case of these more general transformations. 
Indeed, from elementary properties of determinants we have 


$$\det V^\dagger V = \det V^\dagger \cdot \det V = \det V^* \cdot \det V = |\det V|^2 = 1 \qquad (12.13)$$
so that
$$\det V = \exp(i\theta) \qquad (12.14)$$


where $\theta$ is a real number. We can separate off such an overall phase factor from
the transformations mixing ‘p’ and ‘n’, because it corresponds to a rotation 
of the phase of both p and n wavefunctions by the same amount: 


$$\psi_p' = e^{i\alpha}\psi_p, \qquad \psi_n' = e^{i\alpha}\psi_n. \qquad (12.15)$$

The V corresponding to (12.15) is $V = e^{i\alpha}1_2$, which has determinant $\exp(2i\alpha)$
and is therefore of the form (12.14) with $\theta = 2\alpha$. In the field-theoretic formalism
of section 7.2, such a symmetry can be shown to lead to the conservation of
baryon number $N_u + N_d - N_{\bar u} - N_{\bar d}$, where bar denotes the antiparticle.

The new physics will lie in the remaining transformations which satisfy 


$$\det V = +1. \qquad (12.16)$$
Such a matrix is said to be a special unitary matrix, which simply means it
has unit determinant. Thus, finally, the V’s we are dealing with are special,
unitary, 2 x 2 matrices. The set of all such matrices forms a group. The
general defining properties of a group are given in appendix M. In the present 
case, the elements of the group are all such 2 x 2 matrices, and the ‘law of 
combination’ is just ordinary matrix multiplication. It is straightforward to 
verify (problem 12.1) that all the defining properties are satisfied here; the 
group is called ‘SU(2)’, the ‘S’ standing for ‘special’, the ‘U’ for ‘unitary’, and 
the ‘2’ for ‘2 x 2’. 
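
The group properties asserted above can be spot-checked numerically. The following Python sketch is an illustration only and is not part of the original text: the parametrization $V = \begin{pmatrix} a & b \\ -b^* & a^* \end{pmatrix}$ with $|a|^2 + |b|^2 = 1$, and all helper names, are assumptions made for the example. It checks closure, the identity and the inverse for two sample matrices; associativity is automatic for matrix multiplication.

```python
import cmath
import math

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    """Hermitian conjugate: transpose and complex-conjugate."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def det2(a):
    """Determinant of a 2 x 2 matrix."""
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

def su2_element(theta, phi, chi):
    """Illustrative parametrization (an assumption of this sketch):
    V = [[a, b], [-b*, a*]] with |a|^2 + |b|^2 = 1."""
    a = math.cos(theta) * cmath.exp(1j * phi)
    b = math.sin(theta) * cmath.exp(1j * chi)
    return [[a, b], [-b.conjugate(), a.conjugate()]]

def is_su2(v, tol=1e-12):
    """True if v is unitary with unit determinant, up to rounding."""
    vdv = matmul(dagger(v), v)
    unitary = all(abs(vdv[i][j] - (1 if i == j else 0)) < tol
                  for i in range(2) for j in range(2))
    return unitary and abs(det2(v) - 1) < tol

IDENTITY = [[1, 0], [0, 1]]
V1 = su2_element(0.3, 1.1, -0.4)
V2 = su2_element(1.2, -0.7, 2.0)

closure_ok = is_su2(matmul(V1, V2))   # product of two elements is an element
identity_ok = is_su2(IDENTITY)        # the unit matrix belongs to the set
inverse_ok = is_su2(dagger(V1))       # the inverse, V^dagger, is special unitary
```

A numerical check of this kind is of course no substitute for the general argument requested in problem 12.1.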

SU(2) is actually an example of a Lie group (see appendix M). Such groups 
have the important property that their physical consequences may be found 
by considering ‘infinitesimal’ transformations, that is — in this case — matrices 
V which differ only slightly from the ‘no-change’ situation corresponding to 
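V = $1_2$. For such an infinitesimal SU(2) matrix $V_{\rm infl}$ we may therefore write
$$V_{\rm infl} = 1_2 + i\xi \qquad (12.17)$$
where $\xi$ is a 2 x 2 matrix whose entries are all first-order small quantities. The
condition $\det V_{\rm infl} = 1$ now reduces, on neglect of second-order terms $O(\xi^2)$,
to the condition (see problem 12.2)
$$\mathrm{Tr}\,\xi = 0. \qquad (12.18)$$
The condition that $V_{\rm infl}$ be unitary, i.e.
$$(1_2 + i\xi)(1_2 - i\xi^\dagger) = 1_2 \qquad (12.19)$$
similarly reduces (in first order) to the condition
$$\xi = \xi^\dagger. \qquad (12.20)$$
Thus $\xi$ is a 2 x 2 traceless Hermitian matrix, which means it must have the
form
$$\xi = \begin{pmatrix} a & b - ic \\ b + ic & -a \end{pmatrix} \qquad (12.21)$$
where a, b, c are infinitesimal real parameters. Writing
$$a = \epsilon_3/2, \qquad b = \epsilon_1/2, \qquad c = \epsilon_2/2, \qquad (12.22)$$
(12.21) can be put in the more suggestive form
$$\xi = \boldsymbol{\epsilon}\cdot\boldsymbol{\tau}/2 \qquad (12.23)$$
where $\boldsymbol{\epsilon}$ stands for the three real quantities
$$\boldsymbol{\epsilon} = (\epsilon_1, \epsilon_2, \epsilon_3) \qquad (12.24)$$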
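which are all first-order small. The three matrices $\boldsymbol{\tau}$ are just the familiar
Hermitian Pauli matrices
$$\tau_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \tau_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \tau_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad (12.25)$$
here called ‘tau’ precisely in order to distinguish them from the mathematically
identical ‘sigma’ matrices which are associated with the real spin degree
of freedom. Hence a general infinitesimal SU(2) matrix takes the form
$$V_{\rm infl} = (1_2 + i\boldsymbol{\epsilon}\cdot\boldsymbol{\tau}/2), \qquad (12.26)$$
and an infinitesimal SU(2) transformation of the p-n doublet is specified by
$$\begin{pmatrix} \psi_p' \\ \psi_n' \end{pmatrix} = (1_2 + i\boldsymbol{\epsilon}\cdot\boldsymbol{\tau}/2)\begin{pmatrix} \psi_p \\ \psi_n \end{pmatrix}. \qquad (12.27)$$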


The $\tau$-matrices clearly play an important role, since they determine the
forms of the three independent infinitesimal SU(2) transformations. They are
called the generators of infinitesimal SU(2) transformations; more precisely,
the matrices $\boldsymbol{\tau}/2$ provide a particular matrix representation of the generators,
namely the two-dimensional, or ‘fundamental’ one (see appendix M). We note
that they do not commute amongst themselves: rather, introducing $\mathbf{T}^{(1/2)} \equiv \boldsymbol{\tau}/2$,
we find (see problem 12.3)
$$[T_i^{(1/2)}, T_j^{(1/2)}] = i\epsilon_{ijk}T_k^{(1/2)}, \qquad (12.28)$$
where i,j and k run from 1 to 3, and a sum on the repeated index k is 
understood as usual. The reader will recognize the commutation relations 
(12.28) as being precisely the same as those of angular momentum operators 
in quantum mechanics: 


$$[J_i, J_j] = i\epsilon_{ijk}J_k. \qquad (12.29)$$


In that case, the choice $J_i = \sigma_i/2 \equiv J_i^{(1/2)}$ would correspond to a (real) spin-
1/2 system. Here the identity between the tau’s and the sigma’s gives us a 
good reason to regard our ‘p-n’ system as formally analogous to a ‘spin-1/2’ 
one. Of course, the ‘analogy’ was made into a mathematical identity by the 
judicious way in which $\xi$ was parametrised in (12.23).
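
The commutation relations (12.28) are easy to confirm by direct multiplication. The short Python sketch below is an illustration and not part of the original text; the helper names are our own. It checks that the matrices $\tau_i/2$ satisfy $[T_i, T_j] = i\epsilon_{ijk}T_k$, and that the un-halved $\tau_i$ do not, since $[\tau_i, \tau_j] = 2i\epsilon_{ijk}\tau_k$.

```python
# Pauli matrices tau_1, tau_2, tau_3 of equation (12.25), as nested lists.
TAU = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def levi_civita(i, j, k):
    """Antisymmetric symbol epsilon_{ijk} for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(a, b):
    """[a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[r][c] - ba[r][c] for c in range(n)] for r in range(n)]

def satisfies_su2_algebra(gens, tol=1e-12):
    """Check [G_i, G_j] = i epsilon_{ijk} G_k for three square matrices."""
    n = len(gens[0])
    for i in range(3):
        for j in range(3):
            lhs = commutator(gens[i], gens[j])
            for r in range(n):
                for c in range(n):
                    rhs = sum(1j * levi_civita(i, j, k) * gens[k][r][c]
                              for k in range(3))
                    if abs(lhs[r][c] - rhs) > tol:
                        return False
    return True

# The fundamental representation T^(1/2) = tau / 2.
T_HALF = [[[entry / 2 for entry in row] for row in tau] for tau in TAU]

half_ok = satisfies_su2_algebra(T_HALF)   # tau/2 obey (12.28)
tau_ok = satisfies_su2_algebra(TAU)       # the bare tau's do not (factor of 2)
```

The factor 1/2 in the parametrization (12.22) is thus exactly what is needed to reproduce the angular momentum algebra (12.29).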
The form for a finite SU(2) transformation V may then be obtained from 
the infinitesimal form using the result 
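$$e^{A} = \lim_{n\to\infty}(1 + A/n)^n \qquad (12.30)$$
generalized to matrices. Let $\boldsymbol{\epsilon} = \boldsymbol{\alpha}/n$, where $\boldsymbol{\alpha} = (\alpha_1, \alpha_2, \alpha_3)$ are three real
finite (not infinitesimal) parameters, apply the infinitesimal transformation n
times, and let n tend to infinity. We obtain
$$V = \exp(i\boldsymbol{\alpha}\cdot\boldsymbol{\tau}/2) \qquad (12.31)$$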
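so that
$$\psi^{(1/2)\prime} \equiv \begin{pmatrix} \psi_p' \\ \psi_n' \end{pmatrix} = \exp(i\boldsymbol{\alpha}\cdot\boldsymbol{\tau}/2)\begin{pmatrix} \psi_p \\ \psi_n \end{pmatrix} = \exp(i\boldsymbol{\alpha}\cdot\boldsymbol{\tau}/2)\psi^{(1/2)}. \qquad (12.32)$$
Note that in the finite transformation, the generators appear in the exponent.
Indeed, (12.31) has the form
$$V = \exp(iG) \qquad (12.33)$$
where $G = \boldsymbol{\alpha}\cdot\boldsymbol{\tau}/2$, from which the unitary property of V easily follows:
$$V^\dagger = \exp(-iG^\dagger) = \exp(-iG) = V^{-1} \qquad (12.34)$$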


where we used the Hermiticity of the tau’s. Equation (12.33) has the general 
form 
unitary matrix = exp(i Hermitian matrix) (12.35) 


where the ‘Hermitian matrix’ is composed of the generators and the trans- 
formation parameters. We shall meet generalizations of this structure in the 
following sub-section for SU(2), again in section 12.2 for SU(3), and a field 
theoretic version of it in section 12.3. 
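
The passage from the infinitesimal form (12.26) to the finite form (12.31) can be illustrated numerically. The Python sketch below is not part of the original text. It applies the infinitesimal transformation n = 2^20 times by repeated squaring and compares the result with the standard closed form $\exp(i\boldsymbol{\alpha}\cdot\boldsymbol{\tau}/2) = \cos(|\boldsymbol{\alpha}|/2)\,1_2 + i\sin(|\boldsymbol{\alpha}|/2)\,\hat{\boldsymbol{\alpha}}\cdot\boldsymbol{\tau}$, which follows from $(\hat{\boldsymbol{\alpha}}\cdot\boldsymbol{\tau})^2 = 1_2$. The sample value of $\boldsymbol{\alpha}$ and the function names are assumptions made for the example.

```python
import math

TAU = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
IDENTITY = [[1, 0], [0, 1]]

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def alpha_dot_tau(alpha):
    """The 2 x 2 matrix alpha . tau for a real 3-vector alpha."""
    return [[sum(alpha[k] * TAU[k][r][c] for k in range(3)) for c in range(2)]
            for r in range(2)]

def v_closed_form(alpha):
    """cos(|alpha|/2) 1 + i sin(|alpha|/2) (alpha . tau)/|alpha|."""
    norm = math.sqrt(sum(x * x for x in alpha))
    if norm == 0:
        return [[complex(x) for x in row] for row in IDENTITY]
    adt = alpha_dot_tau(alpha)
    cos_half, sin_half = math.cos(norm / 2), math.sin(norm / 2)
    return [[cos_half * IDENTITY[r][c] + 1j * sin_half * adt[r][c] / norm
             for c in range(2)] for r in range(2)]

def v_from_limit(alpha, doublings=20):
    """(1 + i alpha.tau/(2n))^n with n = 2**doublings, by repeated squaring."""
    n = 2 ** doublings
    adt = alpha_dot_tau(alpha)
    v = [[IDENTITY[r][c] + 1j * adt[r][c] / (2 * n) for c in range(2)]
         for r in range(2)]
    for _ in range(doublings):
        v = matmul(v, v)
    return v

ALPHA = (0.4, -1.1, 0.7)              # arbitrary finite parameters
V_EXACT = v_closed_form(ALPHA)
V_LIMIT = v_from_limit(ALPHA)

# Largest entry-wise difference; it shrinks like 1/n as n grows.
max_diff = max(abs(V_EXACT[r][c] - V_LIMIT[r][c])
               for r in range(2) for c in range(2))
det_v = V_EXACT[0][0] * V_EXACT[1][1] - V_EXACT[0][1] * V_EXACT[1][0]
```

For these sample parameters the two matrices agree to better than one part in 10^5, and the closed form has unit determinant, as (12.16) requires.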

As promised, (12.32) has essentially the same mathematical form as (4.33). 
In each case, three real parameters appear. In (4.33) they describe the axis 
and angle of a physical rotation in real three-dimensional space: we can always 
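write $\boldsymbol{\alpha} = |\boldsymbol{\alpha}|\hat{\boldsymbol{\alpha}}$ and identify $|\boldsymbol{\alpha}|$ with the angle $\theta$ and $\hat{\boldsymbol{\alpha}}$ with the axis $\hat{\boldsymbol{n}}$ of
the rotation. In (12.32) there are just the three parameters in $\boldsymbol{\alpha}$.²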

In the form (12.32), it is clear that our 2 x 2 isospin transformation is a 
generalization of the global phase transformation of (12.1), except that: 


(i) there are now three ‘phase angles’ $\boldsymbol{\alpha}$;


(ii) there are non-commuting matrix operators (the $\tau$’s) appearing in the
exponent.


The last fact is the reason for the description ‘non-Abelian’ phase invariance. 
As the commutation relations for the $\tau$ matrices show, SU(2) is a non-Abelian
group in that two SU(2) transformations do not in general commute. By con- 
trast, in the case of electric charge or particle number, successive transforma- 
tions clearly commute: this corresponds to an Abelian phase invariance and, 
as noted in section 2.6, to an Abelian U(1) group. 

We may now put our initial ‘spin-1/2’ analogy on a more precise mathe- 
matical footing. In quantum mechanics, states within a degenerate multiplet 
may conveniently be characterized by the eigenvalues of a complete set of Her- 
mitian operators which commute with the Hamiltonian and with each other. 


² It is not obvious that the general SU(2) matrix can be parametrized by an angle $\theta$
with $0 \le \theta \le 2\pi$, and $\hat{\boldsymbol{n}}$: for further discussion of the relation between SU(2) and the
three-dimensional rotation group, see appendix M, section M.7.
In the case of the p-n doublet, it is easy to see what these operators are. We
may write (12.4), (12.6) and (12.7) as 


$$H_2\psi^{(1/2)} = E\psi^{(1/2)} \qquad (12.36)$$
and
$$H_2\psi^{(1/2)\prime} = E\psi^{(1/2)\prime}, \qquad (12.37)$$
where $H_2$ is the 2 x 2 matrix
$$H_2 = \begin{pmatrix} H & 0 \\ 0 & H \end{pmatrix}. \qquad (12.38)$$
Hence $H_2$ is proportional to the unit matrix in this two-dimensional space,
and it therefore commutes with the tau’s:
$$[H_2, \boldsymbol{\tau}] = 0. \qquad (12.39)$$
It then also follows that $H_2$ commutes with V, or equivalently
$$VH_2V^{-1} = H_2 \qquad (12.40)$$
which is the statement that $H_2$ is invariant under the transformation (12.32).
Now the tau’s are Hermitian, and hence correspond to possible observables. 
Equation (12.39) implies that their eigenvalues are constants of the motion 
(i.e. conserved quantities), associated with the invariance (12.40). But the 
tau’s do not commute amongst themselves and so according to the general 
principles of quantum mechanics we cannot give definite values to more than 
one of them at a time. The problem of finding a classification of the states 
which makes the maximum use of (12.39), given the commutation relations 
(12.28), is easily solved by making use of the formal identity between the 
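operators $\tau_i/2$ and angular momentum operators $J_i$ (cf (12.29)). The answer
is³ that the total squared ‘spin’
$$(\mathbf{T}^{(1/2)})^2 = \left(\tfrac{1}{2}\boldsymbol{\tau}\right)^2 = \tfrac{1}{4}(\tau_1^2 + \tau_2^2 + \tau_3^2) = \tfrac{3}{4}1_2 \qquad (12.41)$$
and one component of spin, say $T_3^{(1/2)} = \tfrac{1}{2}\tau_3$, can be given definite values
simultaneously. The corresponding eigenfunctions are just the $\chi_p$’s and $\chi_n$’s
of (12.9), which satisfy
$$\tfrac{1}{4}\boldsymbol{\tau}^2\chi_p = \tfrac{3}{4}\chi_p, \qquad \tfrac{1}{2}\tau_3\chi_p = \tfrac{1}{2}\chi_p \qquad (12.42)$$
$$\tfrac{1}{4}\boldsymbol{\tau}^2\chi_n = \tfrac{3}{4}\chi_n, \qquad \tfrac{1}{2}\tau_3\chi_n = -\tfrac{1}{2}\chi_n. \qquad (12.43)$$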


The reason for the ‘spin’ part of the name ‘isospin’ should by now be clear; 
the term is actually a shortened version of the historical one “isotopic spin”. 


³ See for example Mandl (1992).
In concluding this section we remark that, in this two-dimensional n-p
space, the electromagnetic charge operator is represented by the matrix
$$Q_{\rm em} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} = \tfrac{1}{2}(1_2 + \tau_3). \qquad (12.44)$$
It is clear that although $Q_{\rm em}$ commutes with $\tau_3$, it does not commute with
either $\tau_1$ or $\tau_2$. Thus, as we would expect, electromagnetic corrections to the
strong interaction Hamiltonian will violate SU(2) symmetry. 
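
The statement about the charge operator can be checked in a few lines. The following Python sketch is an illustration only (the helper names are our own): it builds $(1_2 + \tau_3)/2$ and tests which of the three $\tau$ matrices commute with it.

```python
TAU = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(a, b):
    """[a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[r][c] - ba[r][c] for c in range(n)] for r in range(n)]

def is_zero(m, tol=1e-12):
    """True if every entry of m vanishes up to rounding."""
    return all(abs(x) < tol for row in m for x in row)

# Charge operator in the (p, n) space: (1 + tau_3)/2 = diag(1, 0).
Q_EM = [[(1 if r == c else 0) / 2 + TAU[2][r][c] / 2 for c in range(2)]
        for r in range(2)]

# One flag per generator: does Q_em commute with tau_1, tau_2, tau_3?
commutes_with = [is_zero(commutator(Q_EM, tau)) for tau in TAU]
```

Only the $\tau_3$ entry comes out true, so a Hamiltonian term proportional to $Q_{\rm em}$ is invariant under rotations about the 3-axis in isospin space but not under the full SU(2).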


12.1.2 Larger (higher-dimensional) multiplets of SU(2) in
nuclear physics


For the single nucleon states considered so far, the foregoing is really nothing 
more than the general quantum mechanics of a two-state system, phrased in 
‘spin-1/2’ language. The real power of the isospin (SU(2)) symmetry concept 
becomes more apparent when we consider states of several nucleons. For 
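A nucleons in the nucleus, we introduce three ‘total isospin operators’
$\mathbf{T} = (T_1, T_2, T_3)$ via
$$\mathbf{T} = \tfrac{1}{2}\boldsymbol{\tau}_{(1)} + \tfrac{1}{2}\boldsymbol{\tau}_{(2)} + \ldots + \tfrac{1}{2}\boldsymbol{\tau}_{(A)} \qquad (12.45)$$
which are Hermitian. Here $\boldsymbol{\tau}_{(n)}$ is the $\tau$-matrix for the nth nucleon. The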
Hamiltonian H describing the strong interactions of this system is presumed 
to be invariant under the transformation (12.40) for all the nucleons indepen- 
dently. It then follows that 
[H,T] =0. (12.46) 


Thus the eigenvalues of the T operators are constants of the motion. Further, 
since the isospin operators for different nucleons commute with each other 
(they are quite independent), the commutation relations (12.28) for each of 
the individual $\boldsymbol{\tau}$’s imply (see problem 12.4) that the components of $\mathbf{T}$ defined
by (12.45) satisfy the commutation relations
$$[T_i, T_j] = i\epsilon_{ijk}T_k \qquad (12.47)$$
for i, j, k = 1, 2, 3, which are simply the standard angular momentum
commutation relations, once more. Thus the energy levels of nuclei ought to be
characterized (after allowance for electromagnetic effects, and correcting for
the slight neutron-proton mass difference) by the eigenvalues of $\mathbf{T}^2$ and $T_3$,
say, which can be simultaneously diagonalized along with H. These eigenvalues
should then be, to a good approximation, ‘good quantum numbers’ for
nuclei, if the assumed isospin invariance is true. 
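
As a concrete instance of (12.45), consider A = 2. The Python sketch below is an illustration and not part of the original text; the basis ordering (pp, pn, np, nn) and the helper names are assumptions made for the example. It builds the total isospin operators from Kronecker products and checks that pp is an eigenstate with T(T + 1) = 2 and $T_3 = +1$, while the antisymmetric combination (pn - np)/√2 has T = 0.

```python
import math

TAU = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
ID2 = [[1, 0], [0, 1]]

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def kron(a, b):
    """Kronecker product: a acts on nucleon 1, b on nucleon 2."""
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
            for i in range(len(a)) for k in range(len(b))]

def add(a, b):
    """Entry-wise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def apply(m, v):
    """Matrix times column vector."""
    return [sum(m[r][c] * v[c] for c in range(len(v))) for r in range(len(m))]

def is_eigenvector(m, v, eigenvalue, tol=1e-12):
    """True if m v = eigenvalue * v, up to rounding."""
    mv = apply(m, v)
    return all(abs(mv[i] - eigenvalue * v[i]) < tol for i in range(len(v)))

# T_i = tau_i(1)/2 + tau_i(2)/2, as 4 x 4 matrices in the basis (pp, pn, np, nn).
T_TOTAL = [[[x / 2 for x in row]
            for row in add(kron(tau, ID2), kron(ID2, tau))] for tau in TAU]

# T^2 = T_1^2 + T_2^2 + T_3^2.
T_SQUARED = [[0] * 4 for _ in range(4)]
for t in T_TOTAL:
    T_SQUARED = add(T_SQUARED, matmul(t, t))

ROOT_HALF = 1 / math.sqrt(2)
PP = [1, 0, 0, 0]
SINGLET = [0, ROOT_HALF, -ROOT_HALF, 0]     # (pn - np)/sqrt(2)
TRIPLET_0 = [0, ROOT_HALF, ROOT_HALF, 0]    # (pn + np)/sqrt(2)

pp_ok = is_eigenvector(T_SQUARED, PP, 2) and is_eigenvector(T_TOTAL[2], PP, 1)
singlet_ok = (is_eigenvector(T_SQUARED, SINGLET, 0)
              and is_eigenvector(T_TOTAL[2], SINGLET, 0))
triplet0_ok = (is_eigenvector(T_SQUARED, TRIPLET_0, 2)
               and is_eigenvector(T_TOTAL[2], TRIPLET_0, 0))
```

The four two-nucleon isospin states therefore split into a triplet (T = 1) and a singlet (T = 0), just as two spin-1/2 systems do.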

What are the possible eigenvalues? We know that the T's are Hermitian 
and satisfy exactly the same commutation relations (12.47) as the angular 
momentum operators. These conditions are all that are needed to show that 


the eigenvalues of $\mathbf{T}^2$ are of the form T(T + 1), where T = 0, 1/2, 1, ..., and that
for a given T the eigenvalues of $T_3$ are -T, -T + 1, ..., T - 1, T; that is, there
are 2T + 1 degenerate states for a given T. These states all have the same
A value, and since $T_3$ counts +1/2 for every proton and -1/2 for every neutron,
it is clear that successive values of $T_3$ correspond physically to changing one
neutron into a proton or vice versa. Thus we expect to see ‘charge multiplets’
of levels in neighbouring nuclear isobars. These are indeed observed; figure
12.2 shows some examples. These level schemes (which have been adjusted
for Coulomb energy differences, and for the neutron-proton mass difference),
provide clear evidence of T = 1/2 (doublet), T = 1 (triplet) and T = 3/2 (quartet)
multiplets. It is important to note that states in the same T-multiplet must
have the same $J^P$ quantum numbers (these are indicated on the levels for
$^{18}$F); obviously the nuclear forces will depend on the space and spin degrees of
freedom of the nucleons, and will only be the same between different nucleons
if the space-spin part of the wavefunction is the same.

FIGURE 12.2
Energy levels (adjusted for Coulomb energy and neutron-proton mass
differences) of nuclei of the same mass number but different charge, showing (a)
“mirror” doublets, (b) triplets and (c) doublets and quartets.
Thus the assumed invariance of the nucleon-nucleon force produces a richer
nuclear multiplet structure, going beyond the original n-p doublet. These
higher-dimensional multiplets (T = 1, 3/2, ...) are called ‘irreducible
representations’ of SU(2). The commutation relations (12.47) are called the Lie algebra
of SU(2)⁴ (see appendix M), and the general group theoretical problem of
understanding all possible multiplets for SU(2) is equivalent to the problem of
finding matrices which satisfy these commutation relations. These are, in fact,
precisely the angular momentum matrices of dimension (2T + 1) x (2T + 1)
which are generalizations of the $\boldsymbol{\tau}/2$’s, which themselves correspond to T = 1/2,
as indicated in the notation $\mathbf{T}^{(1/2)}$. For example, the T = 1 matrices are 3 x 3
and can be compactly summarised by (problem 12.5)
$$(T_i^{(1)})_{jk} = -i\epsilon_{ijk} \qquad (12.48)$$


where the numbers $-i\epsilon_{ijk}$ are deliberately chosen to be the same numbers
(with a minus sign) that specify the algebra in (12.47); the latter are called the
structure constants of the SU(2) group (see appendix M, sections M.3-M.5). In
general there will be matrices $\mathbf{T}^{(T)}$ of dimensionality (2T + 1) x (2T + 1) which
satisfy (12.47), and correspondingly (2T + 1)-dimensional wavefunctions $\psi^{(T)}$
analogous to the two-dimensional (T = 1/2) case of (12.8). The generalization
of (12.32) to these higher-dimensional multiplets is then
$$\psi^{(T)\prime} = \exp(i\boldsymbol{\alpha}\cdot\mathbf{T}^{(T)})\psi^{(T)}, \qquad (12.49)$$


which has the general form of (12.35). In this case, the matrices $\mathbf{T}^{(T)}$ provide
a (2T + 1)-dimensional matrix representation of the generators of SU(2). We 
shall meet field-theoretic representations of the generators in section 12.3. 
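
The T = 1 matrices of (12.48) can be checked in the same spirit as the $\tau/2$'s. The Python sketch below is an illustration only, with helper names of our own choosing. It builds $(T_i^{(1)})_{jk} = -i\epsilon_{ijk}$, verifies the algebra (12.47), and confirms that $\mathbf{T}^2 = T(T + 1)\,1_3 = 2 \cdot 1_3$.

```python
def levi_civita(i, j, k):
    """Antisymmetric symbol epsilon_{ijk} for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(a, b):
    """[a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[r][c] - ba[r][c] for c in range(n)] for r in range(n)]

def satisfies_su2_algebra(gens, tol=1e-12):
    """Check [G_i, G_j] = i epsilon_{ijk} G_k for three square matrices."""
    n = len(gens[0])
    for i in range(3):
        for j in range(3):
            lhs = commutator(gens[i], gens[j])
            for r in range(n):
                for c in range(n):
                    rhs = sum(1j * levi_civita(i, j, k) * gens[k][r][c]
                              for k in range(3))
                    if abs(lhs[r][c] - rhs) > tol:
                        return False
    return True

# (T_i)_{jk} = -i epsilon_{ijk}: three 3 x 3 matrices, equation (12.48).
T_ONE = [[[-1j * levi_civita(i, j, k) for k in range(3)] for j in range(3)]
         for i in range(3)]

algebra_ok = satisfies_su2_algebra(T_ONE)

# T^2 = sum_i T_i T_i should equal T(T + 1) times the unit matrix, with T = 1.
casimir = [[0] * 3 for _ in range(3)]
for t in T_ONE:
    tt = matmul(t, t)
    casimir = [[casimir[r][c] + tt[r][c] for c in range(3)] for r in range(3)]
casimir_ok = all(abs(casimir[r][c] - (2 if r == c else 0)) < 1e-12
                 for r in range(3) for c in range(3))
```

Both checks succeed, so the structure constants themselves furnish a three-dimensional representation of the generators.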

We now proceed to consider isospin in our primary area of interest, which 
is particle physics. 


12.1.3 Isospin in particle physics: flavour SU(2)f


The neutron and proton states themselves are actually only the ground states 
of a whole series of corresponding B = 1 levels with isospin 1/2 (i.e. doublets).
Another series of baryonic levels comes in four charge states, corresponding
to T = 3/2; and in the meson sector, the $\pi$’s appear as the lowest states of a
sequence of mesonic triplets (T = 1). Many other examples also exist, but
with one remarkable difference as compared to the nuclear physics case: no
baryon states are known with T > 3/2, nor any meson states with T > 1.

The most natural interpretation of these facts is that the observed states 
are composites of more basic entities which carry different charges but are 
nearly degenerate in mass, while the forces between these entities are
charge-independent, just as in the nuclear (p,n) case. These entities are, of course,
the quarks: the n contains (udd), the p is (uud), and the $\Delta$-quartet is (uuu,
uud, udd, ddd). The u-d isospin doublet plays the role of the p-n doublet in
the nuclear case, and this degree of freedom is what we now call SU(2) isospin
flavour symmetry at the quark level, denoted by SU(2)f. We shall denote the
u-d quark doublet wavefunction by
$$q = \begin{pmatrix} u \\ d \end{pmatrix} \qquad (12.50)$$
omitting now the explicit representation label ‘(1/2)’, and shortening ‘$\psi_u$’ to
just ‘u’, and similarly for ‘d’. Then, under an SU(2)f transformation,
$$q \to q' = Vq = \exp(i\boldsymbol{\alpha}\cdot\boldsymbol{\tau}/2)q. \qquad (12.51)$$

⁴ Likewise, the angular momentum commutation relations (12.29) are the Lie algebra of
the rotation group SO(3). The Lie algebras of the two groups are therefore the same. For
an indication of how, nevertheless, the groups do differ, see appendix M, section M.7.


The limitation T ≤ 3/2 for baryonic states can be understood in terms of their
being composed of three T = 1/2 constituents (two of them pair to T = 1 or
T = 0, and the third adds to T = 1 to make T = 3/2 or T = 1/2, and to T = 0 to
make T = 1/2, by the usual angular momentum addition rules). It is, however,
a challenge for QCD to explain why, for example, states with four or five 
quarks should not exist (nor states of one or two quarks!), and why a state 
of six quarks, for example, appears as the deuteron, which is a loosely bound 
state of n and p, rather than as a compact B = 2 analogue of the n and p 
themselves. 
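
The counting of allowed isospins for three T = 1/2 constituents can be mechanized. The Python sketch below is an illustration, not part of the original text; the function names are our own. It applies the usual addition rule (totals running from |T1 - T2| to T1 + T2 in unit steps) repeatedly, using exact fractions.

```python
from fractions import Fraction

def couple(t1, t2):
    """Allowed totals when combining isospins t1 and t2:
    |t1 - t2|, |t1 - t2| + 1, ..., t1 + t2."""
    t1, t2 = Fraction(t1), Fraction(t2)
    lowest, highest = abs(t1 - t2), t1 + t2
    return [lowest + step for step in range(int(highest - lowest) + 1)]

def couple_many(isospins):
    """Couple a list of isospins one at a time, keeping every multiplet."""
    totals = [Fraction(isospins[0])]
    for t in isospins[1:]:
        totals = [total for previous in totals for total in couple(previous, t)]
    return sorted(totals)

HALF = Fraction(1, 2)
three_quarks = couple_many([HALF, HALF, HALF])   # baryons: three T = 1/2 quarks
quark_antiquark = couple_many([HALF, HALF])      # mesons: quark plus antiquark
```

Three quarks give two T = 1/2 multiplets and one T = 3/2 multiplet, and a quark-antiquark pair gives T = 0 and T = 1, in line with the observed limits on baryon and meson isospins.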

Meson states such as the pion are formed from a quark and an antiquark, 
and it is therefore appropriate at this point to explain how antiparticles are 
described in isospin terms. An antiparticle is characterized by having the 
signs of all its additively conserved quantum numbers reversed, relative to 
those of the corresponding particle. Thus if a u-quark has $B = 1/3$, $T = 1/2$, $T_3 = 1/2$,
a $\bar u$-quark has $B = -1/3$, $T = 1/2$, $T_3 = -1/2$. Similarly, the $\bar d$ has $B = -1/3$,
$T = 1/2$ and $T_3 = 1/2$. Note that, while $T_3$ is an additively conserved
quantum number, the magnitude of the isospin is not additively conserved:
rather, it is ‘vectorially’ conserved according to the rules of combining
angular-momentum-like quantum numbers, as we have seen. Thus the antiquarks $\bar d$
and $\bar u$ form the $T_3 = +1/2$ and $T_3 = -1/2$ members of an SU(2)f doublet, just as
u and d themselves do, and the question arises: given that the (u, d) doublet
transforms as in (12.51), how does the ($\bar u$, $\bar d$) doublet transform?

The answer is that antiparticles are assigned to the complex conjugate of 
the representation to which the corresponding particles belong. Thus
identifying $\bar u \equiv u^*$ and $\bar d \equiv d^*$ we have⁵
$$q^{*\prime} = V^*q^*, \quad\text{or}\quad \begin{pmatrix} \bar u' \\ \bar d' \end{pmatrix} = \exp(-i\boldsymbol{\alpha}\cdot\boldsymbol{\tau}^*/2)\begin{pmatrix} \bar u \\ \bar d \end{pmatrix} \qquad (12.52)$$
for the SU(2)f transformation law of the antiquark doublet. In mathematical
terms, this means (compare (12.32)) that the three matrices $-\tfrac{1}{2}\boldsymbol{\tau}^*$ must
represent the generators of SU(2)f in the 2* representation (i.e. the complex
conjugate of the original two-dimensional representation, which we will now
call 2). Referring to (12.25), we see that $\tau_1^* = \tau_1$, $\tau_2^* = -\tau_2$ and $\tau_3^* = \tau_3$. It is
then easy to check that the three matrices $-\tau_1/2$, $+\tau_2/2$ and $-\tau_3/2$ do indeed
satisfy the required commutation relations (12.28), and thus provide a valid
matrix representation of the SU(2) generators. Also, since the third component
of isospin is here represented by $-\tau_3^*/2 = -\tau_3/2$, the desired reversal in
sign of the additively conserved eigenvalue does occur.

⁵ The overbar ($\bar u$ etc.) here stands only for ‘antiparticle’, and has nothing to do with the
Dirac conjugate $\bar\psi$ introduced in section 4.4.

Although the quark doublet (u, d) and antiquark doublet ($\bar u$, $\bar d$) do transform
differently under SU(2)f transformations, there is nevertheless a sense
in which the 2* and 2 representations are somehow the ‘same’: after all, the
quantum numbers $T = 1/2$, $T_3 = \pm 1/2$ describe them both. In fact, the two
representations are ‘unitarily equivalent’, in that we can find a unitary matrix
$U_C$ such that
$$U_C\exp(-i\boldsymbol{\alpha}\cdot\boldsymbol{\tau}^*/2)U_C^{-1} = \exp(i\boldsymbol{\alpha}\cdot\boldsymbol{\tau}/2). \qquad (12.53)$$
This requirement is easier to disentangle if we consider infinitesimal
transformations, for which (12.53) becomes
$$U_C(-\boldsymbol{\tau}^*)U_C^{-1} = \boldsymbol{\tau}, \qquad (12.54)$$
or
$$U_C\tau_1U_C^{-1} = -\tau_1, \qquad U_C\tau_2U_C^{-1} = \tau_2, \qquad U_C\tau_3U_C^{-1} = -\tau_3. \qquad (12.55)$$
Bearing the commutation relations (12.28) in mind, and the fact that $\tau_i^{-1} = \tau_i$,
it is clear that we can choose $U_C$ proportional to $\tau_2$, and set
$$U_C = i\tau_2 = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \qquad (12.56)$$
to obtain a convenient unitary form. From (12.52) and (12.53) we obtain
$(U_Cq^{*\prime}) = V(U_Cq^*)$, which implies that the doublet
$$U_C\begin{pmatrix} \bar u \\ \bar d \end{pmatrix} = \begin{pmatrix} \bar d \\ -\bar u \end{pmatrix} \qquad (12.57)$$


transforms in exactly the same way as (u, d). This result is useful, because
it means that we can use the familiar tables of (Clebsch-Gordan) angular
momentum coupling coefficients for combining quark and antiquark states
together, provided we include the relative minus sign between the $\bar d$ and $\bar u$
components which has appeared in (12.57). Note that, as expected, the $\bar d$ is
in the $T_3 = +1/2$ position, and the $\bar u$ is in the $T_3 = -1/2$ position.
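
The equivalence of the 2 and 2* representations can be verified explicitly. The Python sketch below is an illustration and not part of the original text; the sample SU(2) element, the sample quark amplitudes and the helper names are all assumptions made for the example. It checks (12.54) for each generator, checks the finite relation $U_CV^*U_C^{-1} = V$ for one sample V, and reproduces the doublet (12.57).

```python
import cmath
import math

TAU = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
U_C = [[0, 1], [-1, 0]]        # i tau_2, equation (12.56)
U_C_INV = [[0, -1], [1, 0]]    # inverse of U_C (its transpose)

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def conj(m):
    """Entry-wise complex conjugate (no transpose)."""
    return [[complex(x).conjugate() for x in row] for row in m]

def close(a, b, tol=1e-12):
    """True if two 2 x 2 matrices agree up to rounding."""
    return all(abs(a[r][c] - b[r][c]) < tol for r in range(2) for c in range(2))

def conjugate_by_uc(m):
    """U_C m U_C^{-1}."""
    return matmul(matmul(U_C, m), U_C_INV)

# Infinitesimal statement (12.54): U_C (-tau_i^*) U_C^{-1} = tau_i.
generators_ok = all(
    close(conjugate_by_uc([[-x for x in row] for row in conj(tau)]), tau)
    for tau in TAU)

# Finite statement for a sample SU(2) element V = [[a, b], [-b*, a*]].
a = math.cos(0.3) * cmath.exp(1.1j)
b = math.sin(0.3) * cmath.exp(-0.4j)
V = [[a, b], [-b.conjugate(), a.conjugate()]]
finite_ok = close(conjugate_by_uc(conj(V)), V)

# Antiquark doublet (ubar, dbar) = (u*, d*) for sample amplitudes u, d.
u, d = 0.6 + 0.2j, -0.3 + 0.7j
antiquarks = [u.conjugate(), d.conjugate()]
rotated = [sum(U_C[r][c] * antiquarks[c] for c in range(2)) for r in range(2)]
# rotated is (dbar, -ubar), the doublet of equation (12.57).
```

The relative minus sign in the lower component is exactly the one that must be carried along when Clebsch-Gordan tables are used for quark-antiquark states.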

As an application of these results, let us compare the T = 0 combination
of the p and n states to form the (isoscalar) deuteron, and the combination
of (u, d) and ($\bar u$, $\bar d$) states to form the isoscalar $\omega$-meson. In the first, the
isospin part of the wavefunction is $\frac{1}{\sqrt 2}(\psi_p\psi_n - \psi_n\psi_p)$, corresponding to the
S = 0 combination of two spin-1/2 particles in quantum mechanics given by
$\frac{1}{\sqrt 2}(|\uparrow\rangle|\downarrow\rangle - |\downarrow\rangle|\uparrow\rangle)$. But in the second case the corresponding wavefunction is
$\frac{1}{\sqrt 2}(\bar d d - (-\bar u)u) = \frac{1}{\sqrt 2}(\bar d d + \bar u u)$. Similarly, the $T = 1$, $T_3 = 0$ state describing
the $\pi^0$ is $\frac{1}{\sqrt 2}(\bar d d + (-\bar u)u) = \frac{1}{\sqrt 2}(\bar d d - \bar u u)$.

There is a very convenient alternative way of obtaining these wavefunc- 
tions, which we include here because it generalizes straightforwardly to SU(3); 
its advantage is that it avoids the use of the explicit C-G coupling coefficients, 
and of their (more complicated) analogues in SU(3). 

Bearing in mind the identifications $\bar u \equiv u^*$, $\bar d \equiv d^*$, we see that the T = 0
$\bar q q$ combination $\bar u u + \bar d d$ can be written as $u^*u + d^*d$, which is just $q^\dagger q$
(recall that $\dagger$ means transpose and complex conjugate). Under an SU(2)f
transformation, $q \to q' = Vq$, so $q^\dagger \to q'^\dagger = q^\dagger V^\dagger$ and
$$q^\dagger q \to q'^\dagger q' = q^\dagger V^\dagger Vq = q^\dagger q \qquad (12.58)$$
using $V^\dagger V = 1_2$; thus $q^\dagger q$ is indeed an SU(2)f invariant, which means it has
T = 0 (no multiplet partners).


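We may also construct the T = 1 $\bar q q$ states in a similar way. Consider
the three quantities $v_i$ defined by
$$v_i = q^\dagger\tau_iq, \qquad i = 1, 2, 3. \qquad (12.59)$$
Under an infinitesimal SU(2)f transformation
$$q' = (1_2 + i\boldsymbol{\epsilon}\cdot\boldsymbol{\tau}/2)q, \qquad (12.60)$$
the three quantities $v_i$ transform to
$$v_i' = q^\dagger(1_2 - i\boldsymbol{\epsilon}\cdot\boldsymbol{\tau}/2)\tau_i(1_2 + i\boldsymbol{\epsilon}\cdot\boldsymbol{\tau}/2)q, \qquad (12.61)$$
where we have used $q'^\dagger = q^\dagger(1_2 + i\boldsymbol{\epsilon}\cdot\boldsymbol{\tau}/2)^\dagger$ and then $\boldsymbol{\tau}^\dagger = \boldsymbol{\tau}$. Retaining only
the first-order terms in $\boldsymbol{\epsilon}$ gives (problem 12.6)
$$v_i' = v_i + i\frac{\epsilon_j}{2}q^\dagger(\tau_i\tau_j - \tau_j\tau_i)q \qquad (12.62)$$
where the sum on j = 1, 2, 3 is understood. But from (12.28) we know the
commutator of two $\tau$’s, so that (12.62) becomes
$$v_i' = v_i + i\frac{\epsilon_j}{2}q^\dagger\,2i\epsilon_{ijk}\tau_k\,q \quad \text{(sum on k = 1, 2, 3)}$$
$$= v_i - \epsilon_{ijk}\epsilon_jq^\dagger\tau_kq$$
$$= v_i - \epsilon_{ijk}\epsilon_jv_k, \qquad (12.63)$$
which may also be written in ‘vector’ notation as
$$\mathbf{v}' = \mathbf{v} - \boldsymbol{\epsilon}\times\mathbf{v}. \qquad (12.64)$$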

Equation (12.63) states that, under an (infinitesimal) SU(2)f transformation,
the three quantities $v_i$ (i = 1, 2, 3) transform into specific linear
combinations of themselves, as determined by the coefficients $\epsilon_{ijk}$ (the $\epsilon$’s are just
the parameters of the infinitesimal transformation). This is precisely what is
needed for a set of quantities to form the basis for a representation. In this
case, it is the T = 1 representation as we can guess from the multiplicity of
three, but we can also directly verify it, as follows. Equation (12.49) with
T = 1, together with (12.48), tell us how a T = 1 triplet should transform:
namely, under an infinitesimal transformation (with $1_3$ the unit 3 x 3 matrix),
$$\psi_i^{(1)\prime} = (1_3 + i\boldsymbol{\epsilon}\cdot\mathbf{T}^{(1)})_{ik}\psi_k^{(1)} \quad \text{(sum on k = 1, 2, 3)}$$
$$= (1_3 + i\epsilon_jT_j^{(1)})_{ik}\psi_k^{(1)} \quad \text{(sum on j = 1, 2, 3)}$$
$$= (\delta_{ik} + i\epsilon_j(T_j^{(1)})_{ik})\psi_k^{(1)}$$
$$= (\delta_{ik} + i\epsilon_j\cdot(-i\epsilon_{jik}))\psi_k^{(1)} \quad \text{using (12.48)}$$
$$= \psi_i^{(1)} - \epsilon_{ijk}\epsilon_j\psi_k^{(1)} \quad \text{using the antisymmetry of } \epsilon_{ijk} \qquad (12.65)$$


which is exactly the same as (12.63). 

The reader who has worked through problem 4.2(a) will recognize the 
exact analogy between the T = 1 transformation law (12.64) for the isospin 
bilinear q7q, and the 3-vector transformation law (cf (4.9)) for the Pauli 
spinor bilinear too. 

Returning to the physics of $v_i$, inserting (12.50) into (12.59) we find
explicitly

$$ v_1 = \bar{u}d + \bar{d}u, \qquad v_2 = -i\bar{u}d + i\bar{d}u, \qquad v_3 = \bar{u}u - \bar{d}d. \tag{12.66} $$

Apart from the normalization factor of $1/\sqrt{2}$, $v_3$ may therefore be identified with
the $T_3 = 0$ member of the T = 1 triplet, having the quantum numbers of the
$\pi^0$. Neither $v_1$ nor $v_2$ has a definite value of $T_3$, however: rather, we need to
consider the linear combinations

$$ \tfrac{1}{2}(v_1 + iv_2) = \bar{u}d, \qquad T_3 = -1 \tag{12.67} $$

and

$$ \tfrac{1}{2}(v_1 - iv_2) = \bar{d}u, \qquad T_3 = +1 \tag{12.68} $$

which have the quantum numbers of the $\pi^-$ and $\pi^+$. The use of $v_1 \pm iv_2$
here is precisely analogous to the use of the ‘spherical basis’ wavefunctions
$x \pm iy = r\sin\theta\, e^{\pm i\phi}$ for l = 1 states in quantum mechanics, rather than the
‘Cartesian’ ones x and y.

We are now ready to proceed to SU(3). 




12.3 Flavour SU(3)f


Larger hadronic multiplets also exist, in which strange particles are grouped
with non-strange ones. Gell-Mann (1961) and Ne’eman (1961) (see also
Gell-Mann and Ne’eman 1964) were the first to propose SU(3) as the correct
generalization of isospin SU(2) to include strangeness. Like SU(2), SU(3)
is a group whose elements are matrices: in this case, unitary 3 × 3 ones,
of unit determinant. The general group-theoretic analysis of SU(3) is quite
complicated, but is fortunately not necessary for the physical applications we
require. We can, in fact, develop all the results needed by mimicking the steps
followed for SU(2).

We start by finding the general form of an SU(3) matrix. Such matrices
obviously act on 3-component column vectors, the generalization of the
2-component isospinors of SU(2). In more physical terms, we regard the three
quark wavefunctions u, d and s as being approximately degenerate, and we
consider unitary 3 × 3 transformations among them via

$$ q' = Wq \tag{12.69} $$

where q now stands for the 3-component column vector

$$ q = \begin{pmatrix} u \\ d \\ s \end{pmatrix} \tag{12.70} $$

and W is a 3 × 3 unitary matrix of determinant 1 (again, an overall phase
has been extracted). The representation provided by this triplet of states
is called the ‘fundamental’ representation of SU(3)f (just as the isospinor
representation is the fundamental one of SU(2)f).

To determine the general form of an SU(3) matrix W, we follow exactly
the same steps as in the SU(2) case. An infinitesimal SU(3) matrix has the
form

$$ W_{\rm infl} = 1_3 + i\chi \tag{12.71} $$

where $\chi$ is a 3 × 3 traceless Hermitian matrix. Such a matrix involves eight
independent parameters (problem 12.7) and can be written as

$$ \chi = \eta\cdot\lambda/2 \tag{12.72} $$

where $\eta = (\eta_1, \ldots, \eta_8)$ and the $\lambda$'s are eight matrices generalizing the $\tau$
matrices of (12.25). They are the generators of SU(3) in the three-dimensional
fundamental representation, and their commutation relations define the
algebra of SU(3) (compare (12.28) for SU(2)):

$$ [\lambda_a/2, \lambda_b/2] = i f_{abc}\,\lambda_c/2, \tag{12.73} $$

where a, b and c run from 1 to 8.

The $\lambda$-matrices (often called the Gell-Mann matrices) are given in
appendix M, along with the SU(3) structure constants $i f_{abc}$; the constants $f_{abc}$
are all real.
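The algebra (12.73) can be verified directly from the explicit matrices. The sketch below assumes NumPy and the standard form of the Gell-Mann matrices (the text defers their explicit form to appendix M); it extracts $f_{abc}$ from the trace formula $f_{abc} = {\rm Tr}([\lambda_a, \lambda_b]\lambda_c)/4i$, which follows from (12.73) together with the standard normalization ${\rm Tr}(\lambda_a\lambda_b) = 2\delta_{ab}$, an assumption not stated in this section.

```python
import numpy as np

def gell_mann():
    """The eight Gell-Mann matrices in their standard form (index 0..7 for a = 1..8)."""
    lam = np.zeros((8, 3, 3), dtype=complex)
    lam[0][0, 1] = lam[0][1, 0] = 1
    lam[1][0, 1], lam[1][1, 0] = -1j, 1j
    lam[2][0, 0], lam[2][1, 1] = 1, -1
    lam[3][0, 2] = lam[3][2, 0] = 1
    lam[4][0, 2], lam[4][2, 0] = -1j, 1j
    lam[5][1, 2] = lam[5][2, 1] = 1
    lam[6][1, 2], lam[6][2, 1] = -1j, 1j
    lam[7] = np.diag([1, 1, -2]) / np.sqrt(3)
    return lam

LAM = gell_mann()

def commutator(x, y):
    return x @ y - y @ x

# Structure constants from f_abc = Tr([lam_a, lam_b] lam_c) / 4i.
f = np.zeros((8, 8, 8))
for a in range(8):
    for b in range(8):
        for c in range(8):
            f[a, b, c] = (np.trace(commutator(LAM[a], LAM[b]) @ LAM[c]) / 4j).real

def algebra_residual():
    """Largest violation of [lam_a/2, lam_b/2] = i f_abc lam_c/2 over all a, b."""
    worst = 0.0
    for a in range(8):
        for b in range(8):
            lhs = commutator(LAM[a] / 2, LAM[b] / 2)
            rhs = 1j * np.einsum("c,cij->ij", f[a, b], LAM / 2)
            worst = max(worst, np.abs(lhs - rhs).max())
    return worst
```

With zero-based indices, `f[0, 1, 2]` is $f_{123}$, `f[0, 3, 6]` is $f_{147}$ and `f[3, 4, 7]` is $f_{458}$.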

A finite SU(3) transformation on the quark triplet is then (cf (12.32))

$$ q' = \exp(i\alpha\cdot\lambda/2)\,q, \tag{12.74} $$

which also has the ‘generalized phase transformation’ character of (12.35), now
with eight ‘phase angles’. Thus W is parametrized as $W = \exp(i\alpha\cdot\lambda/2)$.

As in the case of SU(2)f, exact symmetry under SU(3)f would imply that 
the three states u, d and s were degenerate in mass. Actually, of course, this 
is not the case: in particular, while the u and d quark masses are of order 1-5 
MeV, the s quark mass is greater, of order 100 MeV. Nevertheless it is still 
possible to regard this as relatively small on a typical hadronic mass scale, so 
we may proceed to explore the physical consequences of this (approximate) 
SU(3)f flavour symmetry. 

Such a symmetry implies that the eigenvalues of the $\lambda$'s are constants
of the motion, but because of the commutation relations (12.73) not all of
these operators have simultaneous eigenstates. This happened for SU(2) too,
but there the very close analogy with SO(3) told us how the states were
to be correctly classified, by the eigenvalues of the relevant complete set of
mutually commuting operators. Here it is more involved: for a start, there
are 8 matrices $\lambda_a$. A glance at appendix M, section M.4.5, shows that two of
the $\lambda$'s are diagonal (in the chosen representation), namely $\lambda_3$ and $\lambda_8$. This
means physically that for SU(3) there are two additively conserved quantum
numbers, which in this case are of course the third component of hadronic
isospin (since $\lambda_3$ is simply $\tau_3$ bordered by zeros), and a quantity related to
strangeness. Defining the hadronic hypercharge Y by Y = B + S, where B is
the baryon number (1/3 for each quark) and the strangeness values are S(u) =
S(d) = 0, S(s) = −1, we find that the physically required eigenvalues imply
that the matrix representing the hypercharge operator is $Y^{(3)} = \frac{1}{\sqrt{3}}\lambda_8$, in this
fundamental (three-dimensional) representation, denoted by the symbol 3.
Identifying $T_3^{(3)} = \lambda_3/2$ then gives the Gell-Mann-Nishijima relation
$Q = T_3 + Y/2$ for the quark charges in units of |e|.

So $\lambda_3$ and $\lambda_8$ are analogous to $\tau_3$; what about the analogue of $\tau^2$, which
is diagonalizable simultaneously with $\tau_3$ in the case of SU(2)? Indeed, (cf
(12.41)) $\tau^2$ is a multiple of the 2 × 2 unit matrix. In just the same way one
finds that $\lambda^2$ is also proportional to the unit matrix:

$$ (\lambda/2)^2 = \sum_{a=1}^{8} (\lambda_a/2)^2 = \frac{4}{3}\, 1_3, \tag{12.75} $$

as can be verified from the explicit forms of the $\lambda$-matrices given in appendix
M, section M.4.5. Thus we may characterize the ‘fundamental triplet’ (12.70)
by the eigenvalues of $(\lambda/2)^2$, $\lambda_3$ and $\lambda_8$. The conventional way of representing
this pictorially is to plot the states in a $Y - T_3$ diagram, as shown in figure
12.3.
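The quantum numbers of the triplet and the value of the Casimir operator in (12.75) can be read off numerically. This is a minimal sketch, assuming NumPy and the standard explicit Gell-Mann matrices (which the text takes from appendix M); the variable names are illustrative.

```python
import numpy as np

def gell_mann():
    """The eight Gell-Mann matrices in their standard form (index 0..7 for a = 1..8)."""
    lam = np.zeros((8, 3, 3), dtype=complex)
    lam[0][0, 1] = lam[0][1, 0] = 1
    lam[1][0, 1], lam[1][1, 0] = -1j, 1j
    lam[2][0, 0], lam[2][1, 1] = 1, -1
    lam[3][0, 2] = lam[3][2, 0] = 1
    lam[4][0, 2], lam[4][2, 0] = -1j, 1j
    lam[5][1, 2] = lam[5][2, 1] = 1
    lam[6][1, 2], lam[6][2, 1] = -1j, 1j
    lam[7] = np.diag([1, 1, -2]) / np.sqrt(3)
    return lam

LAM = gell_mann()

# Quadratic Casimir: sum over a of (lambda_a / 2)^2, expected to be (4/3) times the unit matrix.
casimir = sum((m / 2) @ (m / 2) for m in LAM)

# Diagonal generators; rows and columns are ordered (u, d, s).
T3 = (LAM[2] / 2).real.diagonal()           # third component of isospin
Y = (LAM[7] / np.sqrt(3)).real.diagonal()   # hypercharge Y = B + S
Q = T3 + Y / 2                              # Gell-Mann-Nishijima relation
```

The array `Q` holds the u, d and s charges in units of |e|, in that order.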

We may now consider other representations of SU(3)f. The first important
one is that to which the antiquarks belong. If we denote the fundamental
three-dimensional representation accommodating the quarks by 3, then the
antiquarks have quantum numbers appropriate to the ‘complex conjugate’ of
this representation, denoted by 3* just as in the SU(2) case. The $\bar{q}$
wavefunctions, identified as $\bar{u} = u^*$, $\bar{d} = d^*$ and $\bar{s} = s^*$, then transform by

$$ \bar{q}' = W^*\bar{q} = \exp(-i\alpha\cdot\lambda^*/2)\,\bar{q}, \qquad \bar{q} = \begin{pmatrix} \bar{u} \\ \bar{d} \\ \bar{s} \end{pmatrix} \tag{12.76} $$

instead of by (12.74). As for the 2* representation of SU(2), (12.76) means
that the eight quantities $-\lambda^*/2$ represent the SU(3) generators in this 3*
representation. Referring to appendix M, section M.4.5, one quickly sees
that $\lambda_3$ and $\lambda_8$ are real, so that the eigenvalues of the physical observables
$T_3^{(3^*)} = -\lambda_3/2$ and $Y^{(3^*)} = -\frac{1}{\sqrt{3}}\lambda_8$ (in this representation) are reversed
relative to those in the 3, as expected for antiparticles. The $\bar{u}$, $\bar{d}$ and $\bar{s}$ states
may also be plotted on the $Y - T_3$ diagram, figure 12.3, as shown.

FIGURE 12.3
The $Y - T_3$ quantum numbers of the fundamental triplet 3 of quarks, and of
the antitriplet 3* of antiquarks.

Here is already one important difference between SU(3) and SU(2): the 
fundamental SU(3) representation 3 and its complex conjugate 3* are not 
equivalent. This follows immediately from figure 12.3, where it is clear that 
the extra quantum number Y distinguishes the two representations. 
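The inequivalence of 3 and 3* can be made concrete by comparing the $(T_3, Y)$ pairs of the two representations. The sketch below assumes NumPy and writes the diagonal generators explicitly. For contrast it also checks that, in SU(2), the matrix $i\tau_2$ maps the 2* generators $-\tau^*/2$ back onto $\tau/2$; this similarity transformation is a standard result assumed here, not something derived in this section.

```python
import numpy as np

# Diagonal generators of the quark triplet 3 (rows and columns ordered u, d, s).
T3_quark = np.diag([1, -1, 0]) / 2      # lambda_3 / 2
Y_quark = np.diag([1, 1, -2]) / 3       # lambda_8 / sqrt(3)

# In the 3* representation the generators are -lambda^*/2; lambda_3 and lambda_8
# are real, so the diagonal entries simply change sign.
T3_anti = -T3_quark.conj()
Y_anti = -Y_quark.conj()

def weights(t3, y):
    """Sorted list of (T3, Y) pairs read off the two diagonals."""
    return sorted((round(float(a), 6), round(float(b), 6))
                  for a, b in zip(np.diag(t3), np.diag(y)))

triplet = weights(T3_quark, Y_quark)
antitriplet = weights(T3_anti, Y_anti)
su3_weights_differ = triplet != antitriplet

# SU(2) contrast: S = i tau_2 satisfies S (-tau_a^*) S^{-1} = tau_a for a = 1, 2, 3.
TAU = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)
S = 1j * TAU[1]
su2_equivalent = all(
    np.allclose(S @ (-TAU[a].conj()) @ np.linalg.inv(S), TAU[a]) for a in range(3)
)
```

Equivalent representations must share the same generator eigenvalues, so the differing hypercharge spectra of `triplet` and `antitriplet` are enough to rule out any analogue of `S` for SU(3).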

Larger SU(3)f representations can be created by combining quarks and
antiquarks, as in SU(2)f. For our present purposes, an important one is the
eight-dimensional (‘octet’) representation which appears when one combines
the 3* and 3 representations, in a way which is very analogous to the
three-dimensional (‘triplet’) representation obtained by combining the 2* and 2
representations of SU(2).

Consider first the quantity $\bar{u}u + \bar{d}d + \bar{s}s$. As in the SU(2) case, this can
be written equivalently as $q^\dagger q$, which is invariant under $q \to q' = Wq$ since
$W^\dagger W = 1_3$. So this combination is an SU(3) singlet. The octet coupling is
formed by a straightforward generalization of the SU(2) triplet coupling $q^\dagger\tau q$
of (12.59),

$$ w_a = q^\dagger\lambda_a q, \qquad a = 1, 2, \ldots, 8. \tag{12.77} $$


Under an infinitesimal SU(3)f transformation (compare (12.61) and (12.62)),

$$ \begin{aligned} w_a \to w_a' &= q^\dagger (1_3 - i\eta\cdot\lambda/2)\,\lambda_a\,(1_3 + i\eta\cdot\lambda/2)\, q \\ &\approx q^\dagger\lambda_a q + i\frac{\eta_b}{2}\, q^\dagger(\lambda_a\lambda_b - \lambda_b\lambda_a)\, q \end{aligned} \tag{12.78} $$

where the sum on b = 1 to 8 is understood. Using (12.73) for the commutator
of two $\lambda$'s we find

$$ w_a' = w_a + i\frac{\eta_b}{2}\, q^\dagger\, 2i f_{abc}\lambda_c\, q \tag{12.79} $$

or

$$ w_a' = w_a - f_{abc}\,\eta_b\, w_c \tag{12.80} $$

which may usefully be compared with (12.63). Just as in the SU(2)f triplet
case, equation (12.80) shows that, under an SU(3)f transformation, the eight
quantities $w_a$ (a = 1, 2, ..., 8) transform into specific linear combinations of
themselves, as determined by the coefficients $f_{abc}$ (the $\eta$'s are just the
parameters of the infinitesimal transformation).

This is, again, precisely what is needed for a set of quantities to form the
basis for a representation: in this case, an eight-dimensional representation
of SU(3)f. For a finite SU(3)f transformation, we can ‘exponentiate’ (12.80)
to obtain

$$ w' = \exp(i\alpha\cdot G^{(8)})\, w \tag{12.81} $$

where w is an 8-component column vector

$$ w = \begin{pmatrix} w_1 \\ w_2 \\ \vdots \\ w_8 \end{pmatrix} \tag{12.82} $$

such that $w_a = q^\dagger\lambda_a q$, and where (cf (12.49) for SU(2)f) the quantities
$G^{(8)} = (G_1^{(8)}, G_2^{(8)}, \ldots, G_8^{(8)})$ are 8 × 8 matrices, acting on the 8-component vector w,
and forming an 8-dimensional representation of the algebra of SU(3): that is
to say, the $G^{(8)}$'s satisfy (cf (12.73))

$$ [G_a^{(8)}, G_b^{(8)}] = i f_{abc}\, G_c^{(8)}. \tag{12.83} $$

FIGURE 12.4
The $Y - T_3$ quantum numbers of the pseudoscalar meson octet.

The actual form of the $G_a^{(8)}$ matrices is given by comparing the infinitesimal
version of (12.81) with (12.80):

$$ (G_a^{(8)})_{bc} = -i f_{abc} \tag{12.84} $$

as may be checked in problem 12.8, where it is also verified that the matrices
specified by (12.84) do obey the commutation relations (12.83).
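The content of problem 12.8 can be previewed numerically. The sketch below assumes NumPy and the standard explicit Gell-Mann matrices; it builds the 8 × 8 matrices of (12.84), checks the algebra (12.83), and compares the first-order octet law (12.80) with a direct transformation of an illustrative quark triplet `q` (the values of `q` and `eta` are arbitrary assumptions).

```python
import numpy as np

def gell_mann():
    """The eight Gell-Mann matrices in their standard form (index 0..7 for a = 1..8)."""
    lam = np.zeros((8, 3, 3), dtype=complex)
    lam[0][0, 1] = lam[0][1, 0] = 1
    lam[1][0, 1], lam[1][1, 0] = -1j, 1j
    lam[2][0, 0], lam[2][1, 1] = 1, -1
    lam[3][0, 2] = lam[3][2, 0] = 1
    lam[4][0, 2], lam[4][2, 0] = -1j, 1j
    lam[5][1, 2] = lam[5][2, 1] = 1
    lam[6][1, 2], lam[6][2, 1] = -1j, 1j
    lam[7] = np.diag([1, 1, -2]) / np.sqrt(3)
    return lam

LAM = gell_mann()

# f_abc = Tr([lam_a, lam_b] lam_c) / 4i, using Tr(lam_a lam_b) = 2 delta_ab.
comm = np.einsum("aij,bjk->abik", LAM, LAM) - np.einsum("bij,ajk->abik", LAM, LAM)
f = (np.einsum("abik,cki->abc", comm, LAM) / 4j).real

# Adjoint generators of (12.84): G[a][b, c] = -i f_abc.
G = -1j * f

def adjoint_residual():
    """Largest violation of [G_a, G_b] = i f_abc G_c over all a, b."""
    worst = 0.0
    for a in range(8):
        for b in range(8):
            lhs = G[a] @ G[b] - G[b] @ G[a]
            rhs = 1j * np.einsum("c,cij->ij", f[a, b], G)
            worst = max(worst, np.abs(lhs - rhs).max())
    return worst

def octet(q):
    """The eight real numbers w_a = q^dagger lam_a q of (12.77)."""
    return np.array([np.vdot(q, LAM[a] @ q).real for a in range(8)])

q = np.array([0.2 + 0.1j, -0.4 + 0.3j, 0.5 - 0.2j])   # illustrative triplet (assumption)
eta = 1e-5 * np.arange(1, 9) / 8.0                    # small illustrative parameters

W_inf = np.eye(3) + 0.5j * np.einsum("a,aij->ij", eta, LAM)   # (12.71) with (12.72)
w = octet(q)
w_exact = octet(W_inf @ q)
w_first_order = w - np.einsum("abc,b,c->a", f, eta, w)        # (12.80)
w_adjoint = w + (1j * np.einsum("b,bac->ac", eta, G) @ w).real  # first order of (12.81)
```

`w_first_order` and `w_adjoint` agree to rounding error, which is just the statement that (12.84) reproduces (12.80); both agree with `w_exact` up to terms of second order in `eta`.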

As in the SU(2)f case, the 8 states generated by the combinations $q^\dagger\lambda_a q$
are not necessarily the ones with the physically desired quantum numbers. To
get the $\pi^\pm$, for example, we again need to form $(w_1 \mp iw_2)/2$. Similarly, $w_4$
produces $\bar{u}s + \bar{s}u$ and $w_5$ the combination $-i\bar{u}s + i\bar{s}u$, so the $K^\pm$ states are
$w_4 \mp iw_5$. Similarly the $K^0$, $\bar{K}^0$ states are $w_6 - iw_7$ and $w_6 + iw_7$, while the
$\eta$ (in this simple model) would be $w_8 \sim (\bar{u}u + \bar{d}d - 2\bar{s}s)$, which is orthogonal
to both the $\pi^0$ state and the SU(3)f singlet. In this way all the pseudoscalar
octet of $\pi$-partners has been identified, as shown on the $Y - T_3$ diagram of
figure 12.4. We say ‘octet of $\pi$-partners’, but a reader knowing the masses
of these particles might well query why we should feel justified in regarding
them as (even approximately) degenerate. By contrast, a similar octet of
vector ($J^P = 1^-$) mesons (the $\omega$, $\rho$, $K^*$ and $\bar{K}^*$) are all much closer in mass,
averaging around 800 MeV; in these states the $\bar{q}q$ spins add to S = 1, while
the orbital angular momentum is still zero. The pion, and to a much lesser
extent the kaons, seem to be ‘anomalously light’ for some reason: we shall
learn the likely explanation for this in chapter 15.

There is a deep similarity between (12.84) and (12.48). In both cases, a
representation has been found in which the matrix element of a generator is
minus the corresponding structure constant. Such a representation is always
possible for a Lie group, and is called the adjoint, or regular, representation
(see appendix M, section M.5). These representations are of particular
importance in gauge theories, as we will see, since gauge quanta always belong
to the adjoint representation of the gauged group (for example, the 8 gluons
in SU(3)c).

Further flavours c, b and t of course exist, but the mass differences are now 
so large that it is generally not useful to think about higher flavour groups 
such as SU(4)f etc. Instead, we now move on to consider the field-theoretic 
formulation of global SU(2)f and SU(3)f. 




12.4 Non-Abelian global symmetries in Lagrangian 
quantum field theory 


12.4.1 SU(2)f and SU(3)f


As may already have begun to be apparent in chapter 7, Lagrangian quantum 
field theory is a formalism which is especially well adapted for the description 
of symmetries. Without going into any elaborate general theory, we shall now 
give a few examples showing how global flavour symmetry is very easily built 
into a Lagrangian, generalizing in a simple way the global U(1) symmetries 
considered in section 7.1 and section 7.2. This will also prepare the way for 
the (local) gauge case, to be considered in the following chapter. 
Consider, for example, the Lagrangian

$$ \hat{\mathcal{L}} = \bar{\hat{u}}(i\slashed{\partial} - m)\hat{u} + \bar{\hat{d}}(i\slashed{\partial} - m)\hat{d} \tag{12.85} $$

describing two free fermions ‘u’ and ‘d’ of equal mass m, with the overbar
now meaning the Dirac conjugate for the four-component spinor fields. Note
carefully that we are suppressing the space-time arguments of the quantum
fields $\hat{u}(x)$, $\hat{d}(x)$. As in (12.50), we are using the convenient shorthand $\hat\psi_u = \hat{u}$
and $\hat\psi_d = \hat{d}$. Let us introduce

$$ \hat{q} = \begin{pmatrix} \hat{u} \\ \hat{d} \end{pmatrix} \tag{12.86} $$

so that $\hat{\mathcal{L}}$ can be compactly written as

$$ \hat{\mathcal{L}} = \bar{\hat{q}}(i\slashed{\partial} - m)\hat{q}. \tag{12.87} $$

In this form it is obvious that $\hat{\mathcal{L}}$, and hence the associated Hamiltonian $\hat{H}$,
is invariant under the global U(1) transformation

$$ \hat{q}' = e^{-i\alpha}\hat{q} \tag{12.88} $$

(cf (12.1)) which is associated with baryon number conservation. It is also
invariant under global SU(2)f transformations acting in the flavour u-d space
(cf (12.32)):

$$ \hat{q}' = \exp(-i\alpha\cdot\tau/2)\,\hat{q} \tag{12.89} $$

(for the change in sign with respect to (12.51), compare section 7.1 and section
7.2 in the U(1) case). In (12.89), the three parameters $\alpha$ are independent of
x.

What are the conserved quantities associated with the invariance of $\hat{\mathcal{L}}$
under (12.89)? Let us recall the discussion of the simpler U(1) cases studied
in sections 7.1 and 7.2. Considering the complex scalar field of section 7.1, the
analogue of (12.89) was just $\hat\phi \to \hat\phi' = e^{-i\alpha}\hat\phi$, and the conserved quantity was
the Hermitian operator $\hat{N}_\phi$ which appeared in the exponent of the unitary
operator $\hat{U}$ that effected the transformation $\hat\phi \to \hat\phi'$ via

$$ \hat\phi' = \hat{U}\hat\phi\,\hat{U}^\dagger, \tag{12.90} $$

with

$$ \hat{U} = \exp(i\alpha\hat{N}_\phi). \tag{12.91} $$

For an infinitesimal $\alpha$, we have

$$ \hat\phi' \approx (1 - i\epsilon)\hat\phi, \qquad \hat{U} \approx 1 + i\epsilon\hat{N}_\phi, \tag{12.92} $$

so that (12.90) becomes

$$ (1 - i\epsilon)\hat\phi = (1 + i\epsilon\hat{N}_\phi)\,\hat\phi\,(1 - i\epsilon\hat{N}_\phi) \approx \hat\phi + i\epsilon[\hat{N}_\phi, \hat\phi]; \tag{12.93} $$

hence we require

$$ [\hat{N}_\phi, \hat\phi] = -\hat\phi \tag{12.94} $$

for consistency. Insofar as $\hat{N}_\phi$ determines the form of an infinitesimal version
of the unitary transformation operator $\hat{U}$, it seems reasonable to call it the
generator of these global U(1) transformations (compare the discussion after
(12.27) and (12.35), but note that here $\hat{N}_\phi$ is a quantum field operator, not a
matrix).

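Consider now the SU(2)f transformation (12.89), in the infinitesimal case:

$$ \hat{q}' = (1 - i\epsilon\cdot\tau/2)\,\hat{q}. \tag{12.95} $$

Since the single U(1) parameter $\epsilon$ is now replaced by the three parameters
$\epsilon = (\epsilon_1, \epsilon_2, \epsilon_3)$, we shall need three analogues of $\hat{N}_\phi$, which we call

$$ \hat{T}^{(\frac{1}{2})} = (\hat{T}_1^{(\frac{1}{2})}, \hat{T}_2^{(\frac{1}{2})}, \hat{T}_3^{(\frac{1}{2})}), \tag{12.96} $$

corresponding to the three independent infinitesimal SU(2) transformations.
The generalizations of (12.90) and (12.91) are then

$$ \hat{q}' = \hat{U}^{(\frac{1}{2})}\,\hat{q}\,\hat{U}^{(\frac{1}{2})\dagger} \tag{12.97} $$

and

$$ \hat{U}^{(\frac{1}{2})} = \exp(i\alpha\cdot\hat{T}^{(\frac{1}{2})}) \tag{12.98} $$

where the $\hat{T}^{(\frac{1}{2})}$'s are Hermitian, so that $\hat{U}^{(\frac{1}{2})}$ is unitary (cf (12.35)). It would
seem reasonable in this case too to regard the $\hat{T}^{(\frac{1}{2})}$'s as providing a field
theoretic representation of the generators of SU(2)f, an interpretation we shall
shortly confirm. In the infinitesimal case, (12.97) and (12.98) become

$$ (1 - i\epsilon\cdot\tau/2)\,\hat{q} = (1 + i\epsilon\cdot\hat{T}^{(\frac{1}{2})})\,\hat{q}\,(1 - i\epsilon\cdot\hat{T}^{(\frac{1}{2})}), \tag{12.99} $$

using the Hermiticity of the $\hat{T}^{(\frac{1}{2})}$'s. Expanding the right-hand side of (12.99)
to first order in $\epsilon$, and equating coefficients of $\epsilon$ on both sides, (12.99) reduces
to (problem 12.9)

$$ [\hat{T}^{(\frac{1}{2})}, \hat{q}] = -(\tau/2)\,\hat{q}, \tag{12.100} $$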

which is the analogue of (12.94). Equation (12.100) expresses a very specific
commutation property of the operators $\hat{T}^{(\frac{1}{2})}$, which turns out to be satisfied
by the expression

$$ \hat{T}^{(\frac{1}{2})} = \int \hat{q}^\dagger (\tau/2)\, \hat{q}\; d^3x \tag{12.101} $$

as can be checked (problem 12.10) from the anticommutation relations of
the fermionic fields in $\hat{q}$. We shall derive (12.101) from Noether’s theorem
(Noether 1918) in a little while. Note that if ‘$\tau/2$’ is replaced by 1, (12.101)
reduces to the sum of the u and d number operators, as required for the
one-parameter U(1) case. The ‘$\hat{q}^\dagger\tau\hat{q}$’ combination is precisely the field-theoretic
version of the $q^\dagger\tau q$ coupling we discussed in section 12.1.3. It means that the
three operators $\hat{T}^{(\frac{1}{2})}$ themselves belong to a T = 1 triplet of SU(2)f.

It is possible to verify that these $\hat{T}^{(\frac{1}{2})}$'s do indeed commute with the
Hamiltonian $\hat{H}$:

$$ d\hat{T}^{(\frac{1}{2})}/dt = -i[\hat{T}^{(\frac{1}{2})}, \hat{H}] = 0 \tag{12.102} $$

so that their eigenvalues are conserved. That the $\hat{T}^{(\frac{1}{2})}$ are, as already
suggested, a field theoretic representation of the generators of SU(2), appropriate
to the case T = 1/2, follows from the fact that they obey the SU(2) algebra
(problem 12.11):

$$ [\hat{T}_i^{(\frac{1}{2})}, \hat{T}_j^{(\frac{1}{2})}] = i\epsilon_{ijk}\hat{T}_k^{(\frac{1}{2})}. \tag{12.103} $$

For many purposes it is more useful to consider the raising and lowering
operators

$$ \hat{T}_\pm^{(\frac{1}{2})} = \hat{T}_1^{(\frac{1}{2})} \pm i\hat{T}_2^{(\frac{1}{2})}. \tag{12.104} $$

For example, we easily find

$$ \hat{T}_+^{(\frac{1}{2})} = \int \hat{u}^\dagger\hat{d}\; d^3x, \tag{12.105} $$

which destroys a d quark and creates a u, or destroys a $\bar{u}$ and creates a $\bar{d}$, in
either case raising the $\hat{T}_3^{(\frac{1}{2})}$ eigenvalue by +1, since

$$ \hat{T}_3^{(\frac{1}{2})} = \frac{1}{2}\int (\hat{u}^\dagger\hat{u} - \hat{d}^\dagger\hat{d})\; d^3x \tag{12.106} $$

which counts +1/2 for each u (or $\bar{d}$) and −1/2 for each d (or $\bar{u}$). Thus these
operators certainly ‘do the job’ expected of field theoretic isospin operators,
in this isospin-1/2 case.
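The operator relations (12.100), (12.103) and (12.105) can be illustrated in a drastically simplified setting. The sketch below is a toy model, not the field theory itself: it assumes NumPy, keeps only one fermionic mode per flavour (no spatial integral, no spin, no antiparticles), and represents the two annihilation operators as 4 × 4 matrices via a Jordan-Wigner construction. All names are illustrative.

```python
import numpy as np

# Single-mode annihilator and the sign matrix needed for anticommutation.
a = np.array([[0, 1], [0, 0]], dtype=complex)
Z = np.diag([1, -1]).astype(complex)
I2 = np.eye(2, dtype=complex)

# Two anticommuting modes: c[0] annihilates a u quark, c[1] a d quark.
c = [np.kron(a, I2), np.kron(Z, a)]

TAU = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)

EPS = np.zeros((3, 3, 3))
EPS[0, 1, 2] = EPS[1, 2, 0] = EPS[2, 0, 1] = 1
EPS[0, 2, 1] = EPS[2, 1, 0] = EPS[1, 0, 2] = -1

def dag(m):
    return m.conj().T

def comm(x, y):
    return x @ y - y @ x

# Toy analogue of (12.101): T_i = sum_{r,s} c_r^dagger (tau_i / 2)_{rs} c_s.
T = [
    sum(0.5 * TAU[i][r, s] * dag(c[r]) @ c[s] for r in range(2) for s in range(2))
    for i in range(3)
]

def su2_algebra_holds():
    """Check [T_i, T_j] = i eps_ijk T_k, the analogue of (12.103)."""
    return all(
        np.allclose(comm(T[i], T[j]), 1j * sum(EPS[i, j, k] * T[k] for k in range(3)))
        for i in range(3) for j in range(3)
    )

def field_commutator_holds():
    """Check [T_i, c_r] = -(tau_i / 2)_{rs} c_s, the analogue of (12.100)."""
    return all(
        np.allclose(comm(T[i], c[r]), -sum(0.5 * TAU[i][r, s] * c[s] for s in range(2)))
        for i in range(3) for r in range(2)
    )

vacuum = np.zeros(4, dtype=complex)
vacuum[0] = 1
u_state = dag(c[0]) @ vacuum
d_state = dag(c[1]) @ vacuum
T_plus = T[0] + 1j * T[1]      # analogue of (12.104); equals c_u^dagger c_d
```

In this toy model `T_plus` turns the one-d state into the one-u state, and `T[2]` has eigenvalues +1/2 and −1/2 on them, mirroring the counting described for (12.106).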

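In the U(1) case, considering now the fermionic example of section 7.2 for
variety, we could go further and associate the conserved operator $\hat{N}_\psi$ with a
conserved current $\hat{N}_\psi^\mu$:

$$ \hat{N}_\psi = \int \hat{N}_\psi^0\; d^3x, \qquad \hat{N}_\psi^\mu = \bar{\hat\psi}\gamma^\mu\hat\psi \tag{12.107} $$

where

$$ \partial_\mu \hat{N}_\psi^\mu = 0. \tag{12.108} $$

The obvious generalization appropriate to (12.101) is

$$ \hat{T}^{(\frac{1}{2})} = \int \hat{T}^{(\frac{1}{2})0}\; d^3x, \qquad \hat{T}^{(\frac{1}{2})\mu} = \bar{\hat{q}}\gamma^\mu\frac{\tau}{2}\hat{q}. \tag{12.109} $$

Note that both $\hat{N}_\psi^\mu$ and $\hat{T}^{(\frac{1}{2})\mu}$ are of course functions of the space-time
coordinate x, via the (suppressed) dependence of the $\hat{q}$-fields on x. Indeed one
can verify from the equations of motion that

$$ \partial_\mu \hat{T}^{(\frac{1}{2})\mu} = 0. \tag{12.110} $$

Thus $\hat{T}^{(\frac{1}{2})\mu}$ is a conserved isospin current operator appropriate to the T = 1/2
(u, d) system; it transforms as a 4-vector under Lorentz transformations, and
as a T = 1 triplet under SU(2)f transformations.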

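Clearly there should be some general formalism for dealing with all this
more efficiently, and it is provided by a generalization of the steps followed,
in the U(1) case, in equations (7.6)-(7.8). Suppose the Lagrangian involves
a set of fields $\hat\psi_r$ (they could be bosons or fermions) and suppose that it is
invariant under the infinitesimal transformation

$$ \delta\hat\psi_r = -i\epsilon\, T_{rs}\hat\psi_s \tag{12.111} $$

for some set of numerical coefficients $T_{rs}$. Equation (12.111) generalizes (7.5).
Then since $\hat{\mathcal{L}}$ is invariant under this change,

$$ 0 = \delta\hat{\mathcal{L}} = \frac{\partial\hat{\mathcal{L}}}{\partial\hat\psi_r}\,\delta\hat\psi_r + \frac{\partial\hat{\mathcal{L}}}{\partial(\partial^\mu\hat\psi_r)}\,\partial^\mu(\delta\hat\psi_r). \tag{12.112} $$

But

$$ \frac{\partial\hat{\mathcal{L}}}{\partial\hat\psi_r} = \partial^\mu\!\left(\frac{\partial\hat{\mathcal{L}}}{\partial(\partial^\mu\hat\psi_r)}\right) \tag{12.113} $$

from the equations of motion. Hence

$$ \partial^\mu\!\left(\frac{\partial\hat{\mathcal{L}}}{\partial(\partial^\mu\hat\psi_r)}\,\delta\hat\psi_r\right) = 0 \tag{12.114} $$

which is precisely a current conservation law of the form

$$ \partial^\mu\hat{j}_\mu = 0. \tag{12.115} $$

Indeed, disregarding the irrelevant constant small parameter $\epsilon$, the conserved
current is

$$ \hat{j}_\mu = -i\,\frac{\partial\hat{\mathcal{L}}}{\partial(\partial^\mu\hat\psi_r)}\, T_{rs}\hat\psi_s. \tag{12.116} $$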

Let us try this out on (12.87) with

$$ \delta\hat{q} = (-i\epsilon\cdot\tau/2)\,\hat{q}. \tag{12.117} $$

As we know already, there are now three $\epsilon$'s, and so three $T_{rs}$'s, namely
$\frac{1}{2}(\tau_1)_{rs}$, $\frac{1}{2}(\tau_2)_{rs}$, $\frac{1}{2}(\tau_3)_{rs}$. For each one we have a current, for example

$$ \hat{T}^{(\frac{1}{2})}_{1\mu} = -i\,\frac{\partial\hat{\mathcal{L}}}{\partial(\partial^\mu\hat{q})}\,\frac{\tau_1}{2}\,\hat{q} = \bar{\hat{q}}\gamma_\mu\frac{\tau_1}{2}\hat{q} \tag{12.118} $$

and similarly for the other $\tau$'s, and so we recover (12.109). From the
invariance of the Lagrangian under the transformation (12.117) there follows the
conservation of an associated symmetry current. This is the quantum field
theory version of Noether’s theorem.

This theorem is of fundamental significance as it tells us how to relate
symmetries (under transformations of the general form (12.111)) to ‘current’
conservation laws (of the form (12.115)), and it constructs the actual currents
for us. In gauge theories, the dynamics is generated from a symmetry, in
the sense that (as we have seen in the local U(1) of electromagnetism) the
symmetry currents are the dynamical currents that drive the equations for
the force field. Thus the symmetries of the Lagrangian are basic to gauge
field theories.

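Let us look at another example, this time involving spin-0 fields. Suppose
we have three spin-0 fields all with the same mass, and take

$$ \hat{\mathcal{L}} = \tfrac{1}{2}\partial_\mu\hat\phi_1\partial^\mu\hat\phi_1 + \tfrac{1}{2}\partial_\mu\hat\phi_2\partial^\mu\hat\phi_2 + \tfrac{1}{2}\partial_\mu\hat\phi_3\partial^\mu\hat\phi_3 - \tfrac{1}{2}m^2(\hat\phi_1^2 + \hat\phi_2^2 + \hat\phi_3^2). \tag{12.119} $$

It is obvious that $\hat{\mathcal{L}}$ is invariant under an arbitrary rotation of the three $\hat\phi$'s
among themselves, generalizing the ‘rotation about the 3-axis’ considered for
the $\hat\phi_1$-$\hat\phi_2$ system of section 7.1. An infinitesimal such rotation is (cf (12.64),
and noting the sign change in the field theory case)

$$ \hat\phi' = \hat\phi + \epsilon\times\hat\phi \tag{12.120} $$

which implies

$$ \delta\hat\phi_r = -i\epsilon_a\, T^{(1)}_{a\,rs}\,\hat\phi_s, \tag{12.121} $$

with

$$ T^{(1)}_{a\,rs} = -i\epsilon_{ars} \tag{12.122} $$

as in (12.48). There are of course three conserved $\hat{T}$ operators again, and three
$\hat{T}^\mu$'s, which we call $\hat{T}^{(1)}$ and $\hat{T}^{(1)\mu}$ respectively, since we are now dealing with
a T = 1 isospin case. The a = 1 component of the conserved current in this
case is, from (12.116),

$$ \hat{T}_1^{(1)\mu} = \hat\phi_2\partial^\mu\hat\phi_3 - \hat\phi_3\partial^\mu\hat\phi_2. \tag{12.123} $$

Cyclic permutations give us the other components, which can be summarised
as

$$ \hat{T}^{(1)\mu} = \frac{i}{2}\left(\hat\phi^{(1)T}\, T^{(1)}\,\partial^\mu\hat\phi^{(1)} - (\partial^\mu\hat\phi^{(1)})^{T}\, T^{(1)}\,\hat\phi^{(1)}\right) \tag{12.124} $$

where we have written

$$ \hat\phi^{(1)} = \begin{pmatrix} \hat\phi_1 \\ \hat\phi_2 \\ \hat\phi_3 \end{pmatrix} \tag{12.125} $$

and $T$ (as a superscript) denotes transpose; the factor 1/2 reflects the normalization of
the real-field kinetic terms in (12.119), for which the two terms in (12.124) are equal.
Equation (12.124) has the form expected of a
bosonic spin-0 current, but with the matrices $T^{(1)}$ appearing, appropriate
to the T = 1 (triplet) representation of SU(2)f.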
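The T = 1 matrices of (12.122) and the Noether current of (12.116) can be checked with ordinary vectors standing in for the fields at a single space-time point. This is a minimal sketch, assuming NumPy; `phi`, `dphi` (one component of the gradient) and `eps_par` are arbitrary illustrative numbers.

```python
import numpy as np

EPS = np.zeros((3, 3, 3))
EPS[0, 1, 2] = EPS[1, 2, 0] = EPS[2, 0, 1] = 1
EPS[0, 2, 1] = EPS[2, 1, 0] = EPS[1, 0, 2] = -1

# T = 1 generators of (12.122): T1[a][r, s] = -i eps_{ars}.
T1 = -1j * EPS

def su2_algebra_holds():
    """Check [T_a, T_b] = i eps_abc T_c for the 3 x 3 matrices."""
    return all(
        np.allclose(T1[a] @ T1[b] - T1[b] @ T1[a],
                    1j * np.einsum("c,crs->rs", EPS[a, b], T1))
        for a in range(3) for b in range(3)
    )

phi = np.array([0.3, -1.2, 0.8])          # field values at one point (assumption)
dphi = np.array([0.5, 0.1, -0.7])         # one gradient component at that point (assumption)
eps_par = np.array([0.02, -0.01, 0.03])   # infinitesimal rotation parameters (assumption)

# (12.121): delta phi_r = -i eps_a (T_a)_{rs} phi_s, to be compared with eps x phi of (12.120).
delta_phi = (-1j * np.einsum("a,ars,s->r", eps_par, T1, phi)).real

# (12.116) with dL/d(d phi_r) = d phi_r for the Lagrangian (12.119):
# j_a = -i (d phi)_r (T_a)_{rs} phi_s, to be compared with (phi x d phi)_a as in (12.123).
noether_current = np.array([(-1j * dphi @ T1[a] @ phi).real for a in range(3)])
```

The first component of `noether_current` is `phi[1] * dphi[2] - phi[2] * dphi[1]`, which is the pattern of (12.123).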

The general form of such SU(2) currents should now be clear. For an
isospin T-multiplet of bosons we shall have the form

$$ i\left(\hat\phi^{(T)\dagger}\, T^{(T)}\,\partial^\mu\hat\phi^{(T)} - (\partial^\mu\hat\phi^{(T)})^\dagger\, T^{(T)}\,\hat\phi^{(T)}\right) \tag{12.126} $$

where we have put the $\dagger$ to allow for possibly complex fields; and for an isospin
T-multiplet of fermions we shall have

$$ \bar{\hat\psi}^{(T)}\gamma^\mu\, T^{(T)}\,\hat\psi^{(T)} \tag{12.127} $$

where in each case the (2T + 1) components of $\hat\phi$ or $\hat\psi$ transform as a
T-multiplet under SU(2)f, i.e.

$$ \hat\phi^{(T)\prime} = \exp(-i\alpha\cdot T^{(T)})\,\hat\phi^{(T)} \tag{12.128} $$

and similarly for $\hat\psi^{(T)}$, where $T^{(T)}$ are the (2T + 1) × (2T + 1) matrices representing
the generators of SU(2)f in this representation. In all cases, the integral over
all space of the $\mu = 0$ component of these currents results in a triplet of isospin
operators obeying the SU(2) algebra (12.47), as in (12.103).
The cases considered so far have all been free field theories, but
SU(2)-invariant interactions can be easily formed. For example, the interaction
$g_1\bar{\hat\psi}\tau\hat\psi\cdot\hat\phi$ describes SU(2)-invariant interactions between a T = 1/2 isospinor
(spin-1/2) field $\hat\psi$, and a T = 1 isotriplet (Lorentz scalar) $\hat\phi$. An effective
interaction between pions and nucleons could take the form $g_\pi\bar{\hat\psi}\tau\gamma_5\hat\psi\cdot\hat\phi$, allowing
for the pseudoscalar nature of the pions (we shall see in the following section
that $\bar{\hat\psi}\gamma_5\hat\psi$ is a pseudoscalar, so the product is a true scalar as is required for a
parity-conserving strong interaction). In these examples the ‘vector’ analogy
for the T = 1 states allows us to see that the ‘dot product’ will be invariant.
A similar dot product occurs in the interaction between the isospinor $\hat\psi^{(\frac{1}{2})}$
and the weak SU(2) gauge field $\hat{W}^\mu$, which has the form

$$ g\,\bar{\hat{q}}\gamma^\mu\frac{\tau}{2}\hat{q}\cdot\hat{W}_\mu \tag{12.129} $$

as will be discussed in the following chapter. This is just the SU(2) dot product
of the symmetry current (12.109) and the gauge field triplet, both of which
are in the adjoint (T = 1) representation of SU(2).
All of the foregoing can be generalized straightforwardly to SU(3)f. For
example, the Lagrangian

$$ \hat{\mathcal{L}} = \bar{\hat{q}}(i\slashed{\partial} - m)\hat{q} \tag{12.130} $$

with $\hat{q}$ now extended to

$$ \hat{q} = \begin{pmatrix} \hat{u} \\ \hat{d} \\ \hat{s} \end{pmatrix} \tag{12.131} $$

describes free u, d and s quarks of equal mass m. $\hat{\mathcal{L}}$ is clearly invariant under
global SU(3)f transformations

$$ \hat{q}' = \exp(-i\alpha\cdot\lambda/2)\,\hat{q}, \tag{12.132} $$

as well as the usual global U(1) transformation associated with quark number
conservation. The associated Noether currents are (in somewhat informal
notation)

$$ \hat{G}_a^{(q)\mu} = \bar{\hat{q}}\gamma^\mu\frac{\lambda_a}{2}\hat{q}, \qquad a = 1, 2, \ldots, 8 \tag{12.133} $$

(note that there are eight of them), and the associated conserved ‘charge
operators’ are

$$ \hat{G}_a^{(q)} = \int \hat{G}_a^{(q)0}\; d^3x = \int \hat{q}^\dagger\frac{\lambda_a}{2}\hat{q}\; d^3x, \qquad a = 1, 2, \ldots, 8, \tag{12.134} $$

which obey the SU(3) commutation relations

$$ [\hat{G}_a^{(q)}, \hat{G}_b^{(q)}] = i f_{abc}\,\hat{G}_c^{(q)}. \tag{12.135} $$

SU(3)-invariant interactions can also be formed. A particularly important
one is the ‘SU(3) dot-product’ of two octets (the analogues of the SU(2)
triplets), which arises in the quark-gluon vertex of QCD (see chapters 13 and
14):

$$ -i g_s \sum_f \bar{\hat{q}}_f\gamma^\mu\frac{\lambda_a}{2}\hat{q}_f\,\hat{A}^a_\mu. \tag{12.136} $$

In (12.136), $\hat{q}_f$ stands for the SU(3)c colour triplet

$$ \hat{q}_f = \begin{pmatrix} \hat{f}_r \\ \hat{f}_b \\ \hat{f}_g \end{pmatrix} \tag{12.137} $$

where $\hat{f}$ is any of the six quark flavour fields $\hat{u}, \hat{d}, \hat{c}, \hat{s}, \hat{t}, \hat{b}$, and $\hat{A}^a_\mu$ are the
8 (a = 1, 2, ..., 8) gluon fields. Once again, (12.136) has the form ‘symmetry
current · gauge field’ characteristic of all gauge interactions.


12.4.2 Chiral symmetry 


As our final example of a global non-Abelian symmetry, we shall introduce 
the idea of chiral symmetry, which is an exact symmetry for fermions in the 
limit in which their masses may be neglected. We have seen that the u and 
d quarks have indeed very small masses (< 5 MeV) on hadronic scales, and 
even the s quark mass (~ 100 MeV) is relatively small. Thus we may certainly 
expect some physical signs of the symmetry associated with $m_u = m_d = 0$,
and possibly also of the larger symmetry holding when $m_u = m_d = m_s = 0$.
As we shall see, however, this expectation leads to a puzzle, the resolution of 
which will have to be postponed until the concept of ‘spontaneous symmetry 
breaking’ has been developed in Part VII. 

We begin with the simplest case of just one fermion. Since we are interested
in the ‘small mass’ regime, it is sensible to use the representation (3.40) of
the Dirac matrices, in which the momentum part of the Dirac Hamiltonian is
‘diagonal’ and the mass appears as an ‘off-diagonal’ coupling:

$$ \alpha = \begin{pmatrix} \sigma & 0 \\ 0 & -\sigma \end{pmatrix}, \qquad \beta = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}. \tag{12.138} $$

Writing the general Dirac spinor $\omega$ as

$$ \omega = \begin{pmatrix} \phi \\ \chi \end{pmatrix} \tag{12.139} $$

we have (as in (4.14), (4.15))

$$ E\phi = \sigma\cdot p\,\phi + m\chi \tag{12.140} $$
$$ E\chi = -\sigma\cdot p\,\chi + m\phi. \tag{12.141} $$

We now recall the matrix $\gamma_5$ introduced in section 4.2.1,

$$ \gamma_5 = i\gamma^0\gamma^1\gamma^2\gamma^3, \tag{12.142} $$

which takes the form

$$ \gamma_5 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \tag{12.143} $$

in this representation. The matrix $\gamma_5$ plays a prominent role in chiral
symmetry, as we shall see. Its defining property is that it anticommutes with the $\gamma^\mu$
matrices:

$$ \{\gamma_5, \gamma^\mu\} = 0. \tag{12.144} $$


‘Chirality’ means ‘handedness’, from the Greek word for hand, χείρ. Its
use here stems from the fact that, in the limit m → 0 the 2-component spinors
$\phi$, $\chi$ become helicity eigenstates (cf problem 9.4), having definite ‘handedness’.
As m → 0 we have E → |p|, and (12.140) and (12.141) reduce to

$$ (\sigma\cdot p/|p|)\,\phi = \phi \tag{12.145} $$
$$ (\sigma\cdot p/|p|)\,\chi = -\chi, \tag{12.146} $$

so that the limiting spinor $\phi$ has positive helicity, and $\chi$ negative helicity (cf
(3.68) and (3.69)). In this m → 0 limit, the two helicity spinors are decoupled,
reflecting the fact that no Lorentz transformation can reverse the helicity of
a massless particle. Also in this limit, the Dirac energy operator is

$$ \alpha\cdot p = \begin{pmatrix} \sigma\cdot p & 0 \\ 0 & -\sigma\cdot p \end{pmatrix} \tag{12.147} $$

which is easily seen to commute with $\gamma_5$. Thus the massless states may
equivalently be classified by the eigenvalues of $\gamma_5$, which are clearly ±1 since $\gamma_5^2 = I$.
Consider then a massless fermion with positive helicity. It is described
by the ‘u’-spinor $\begin{pmatrix} \phi \\ 0 \end{pmatrix}$, which is an eigenstate of $\gamma_5$ with eigenvalue +1.
Similarly, a fermion with negative helicity is described by $\begin{pmatrix} 0 \\ \chi \end{pmatrix}$, which has
$\gamma_5 = -1$. Thus for these states chirality equals helicity. We have to be more
careful for antifermions, however. A physical antifermion of energy E and
momentum p is described by a ‘v’-spinor corresponding to −E and −p; but
with m = 0 in (12.140) and (12.141) the equations for $\phi$ and $\chi$ remain the
same for −E, −p as for E, p. Consider the spin, however. If the physical
antiparticle has positive helicity, with p along the z-axis say, then $s_z = +\frac{1}{2}$.
The corresponding v-spinor must then have $s_z = -\frac{1}{2}$ (see section 3.4.3) and
must therefore be of $\chi$ type (12.146). So the v-spinor for this antifermion of
positive helicity is $\begin{pmatrix} 0 \\ \chi \end{pmatrix}$, which has $\gamma_5 = -1$. In summary, for fermions the
$\gamma_5$ eigenvalue is equal to the helicity, and for antifermions it is equal to minus
the helicity. It is the $\gamma_5$ eigenvalue that is called the ‘chirality’.

In the massless limit, the chirality of $\phi$ and $\chi$ is a good quantum number
($\gamma_5$ commuting with the energy operator), and we may say that ‘chirality is
conserved’ in this massless limit. On the other hand, the massive spinor $\omega$ is
clearly not an eigenstate of chirality:

$$ \gamma_5\begin{pmatrix} \phi \\ \chi \end{pmatrix} = \begin{pmatrix} \phi \\ -\chi \end{pmatrix} \neq \lambda\begin{pmatrix} \phi \\ \chi \end{pmatrix}. \tag{12.148} $$

Referring to (12.140) and (12.141), we may therefore regard the mass terms
as ‘coupling the states of different chirality’.

It is usual to introduce operators $P_{R,L} = \frac{1 \pm \gamma_5}{2}$ which ‘project’ out states
of definite chirality from $\omega$:

$$ \omega = \left(\frac{1 + \gamma_5}{2}\right)\omega + \left(\frac{1 - \gamma_5}{2}\right)\omega \equiv P_R\,\omega + P_L\,\omega \equiv \omega_R + \omega_L, \tag{12.149} $$

so that

$$ \omega_R = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} \phi \\ \chi \end{pmatrix} = \begin{pmatrix} \phi \\ 0 \end{pmatrix}, \qquad \omega_L = \begin{pmatrix} 0 \\ \chi \end{pmatrix}. \tag{12.150} $$

Then clearly $\gamma_5\omega_R = \omega_R$ and $\gamma_5\omega_L = -\omega_L$; slightly confusingly, the notation
‘R’, ‘L’ is used for the chirality eigenvalue.
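The matrix statements of this subsection can be confirmed explicitly. The sketch below assumes NumPy and builds the Dirac matrices in the representation (12.138), with $\gamma^0 = \beta$ and $\gamma^i = \beta\alpha_i$ (the usual relation between the two sets of matrices); the momentum `p` is an arbitrary illustrative value.

```python
import numpy as np

SIGMA = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
ZERO2 = np.zeros((2, 2))
I2 = np.eye(2)

# Representation (12.138): alpha block-diagonal, beta off-diagonal.
ALPHA = [np.block([[s, ZERO2], [ZERO2, -s]]) for s in SIGMA]
BETA = np.block([[ZERO2, I2], [I2, ZERO2]]).astype(complex)

GAMMA0 = BETA
GAMMA = [BETA @ al for al in ALPHA]                        # gamma^1, gamma^2, gamma^3
GAMMA5 = 1j * GAMMA0 @ GAMMA[0] @ GAMMA[1] @ GAMMA[2]      # (12.142)

# Chirality projectors of (12.149).
P_R = (np.eye(4) + GAMMA5) / 2
P_L = (np.eye(4) - GAMMA5) / 2

p = np.array([0.3, -0.4, 1.2])                             # illustrative momentum (assumption)
alpha_dot_p = sum(pi * al for pi, al in zip(p, ALPHA))     # massless energy operator (12.147)

def anticommutator(x, y):
    return x @ y + y @ x

def commutator(x, y):
    return x @ y - y @ x

# Positive-helicity massless 'u'-spinor for momentum along z: phi = (1, 0), chi = 0.
omega_plus = np.array([1, 0, 0, 0], dtype=complex)
```

The mass matrix `BETA` anticommutes with `GAMMA5` while `alpha_dot_p` commutes with it, which is the matrix form of the statement that only the mass term couples the two chiralities.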
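We now reformulate the above in field-theoretic terms. The Dirac
Lagrangian for a single massless fermion is

$$ \hat{\mathcal{L}}_0 = \bar{\hat\psi}\, i\slashed{\partial}\,\hat\psi. \tag{12.151} $$

This is invariant not only under the now familiar global U(1) transformation
$\hat\psi \to \hat\psi' = e^{-i\alpha}\hat\psi$, but also under the ‘global chiral U(1)’ transformation

$$ \hat\psi \to \hat\psi' = e^{-i\beta\gamma_5}\hat\psi \tag{12.152} $$

where $\beta$ is an arbitrary (x-independent) real parameter. The invariance is
easily verified: using $\{\gamma^0, \gamma_5\} = 0$ we have

$$ \bar{\hat\psi}' = \hat\psi'^\dagger\gamma^0 = \hat\psi^\dagger e^{i\beta\gamma_5}\gamma^0 = \hat\psi^\dagger\gamma^0 e^{-i\beta\gamma_5} = \bar{\hat\psi}\, e^{-i\beta\gamma_5}, \tag{12.153} $$

and then using $\{\gamma^\mu, \gamma_5\} = 0$,

$$ \begin{aligned} \bar{\hat\psi}'\gamma^\mu\partial_\mu\hat\psi' &= \bar{\hat\psi}\, e^{-i\beta\gamma_5}\gamma^\mu\partial_\mu e^{-i\beta\gamma_5}\hat\psi \\ &= \bar{\hat\psi}\,\gamma^\mu e^{i\beta\gamma_5}\partial_\mu e^{-i\beta\gamma_5}\hat\psi \\ &= \bar{\hat\psi}\,\gamma^\mu\partial_\mu\hat\psi \end{aligned} \tag{12.154} $$

as required. The corresponding Noether current is

$$ \hat{j}_5^\mu = \bar{\hat\psi}\gamma^\mu\gamma_5\hat\psi, \tag{12.155} $$

and the spatial integral of its $\mu = 0$ component is the (conserved) chirality
operator

$$ \hat{Q}_5 = \int \hat\psi^\dagger\gamma_5\hat\psi\; d^3x = \int (\hat\phi^\dagger\hat\phi - \hat\chi^\dagger\hat\chi)\; d^3x. \tag{12.156} $$

We denote this chiral U(1) by U(1)$_5$.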
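Since the derivative does not act on the constant matrix $e^{-i\beta\gamma_5}$, the invariance argument of (12.153) and (12.154) reduces to a statement about 4 × 4 matrices. The following sketch, assuming NumPy and the representation (12.138), checks that statement and shows what goes wrong for a mass term; the angle `beta` is an arbitrary illustrative value.

```python
import numpy as np

SIGMA = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
I2 = np.eye(2)

# Representation (12.138) written with Kronecker products:
# gamma^0 = [[0, 1], [1, 0]], gamma^i = [[0, -sigma_i], [sigma_i, 0]], gamma_5 = diag(1, -1).
GAMMA0 = np.kron(SIGMA[0], I2)
GAMMA = [GAMMA0] + [np.kron(-1j * SIGMA[1], s) for s in SIGMA]
GAMMA5 = np.kron(SIGMA[2], I2)

def chiral_rotation(beta):
    """exp(-i beta gamma_5) = cos(beta) - i gamma_5 sin(beta), valid because gamma_5^2 = 1."""
    return np.cos(beta) * np.eye(4) - 1j * np.sin(beta) * GAMMA5

beta = 0.37                     # illustrative angle (assumption)
U = chiral_rotation(beta)

# psi-bar gamma^mu d_mu psi involves psi^dagger (gamma^0 gamma^mu) d_mu psi, so the kinetic
# term is invariant if U^dagger gamma^0 gamma^mu U = gamma^0 gamma^mu for every mu.
kinetic_invariant = all(
    np.allclose(U.conj().T @ GAMMA0 @ g @ U, GAMMA0 @ g) for g in GAMMA
)

# A mass term m psi-bar psi involves psi^dagger gamma^0 psi instead, which is rotated.
mass_matrix_after = U.conj().T @ GAMMA0 @ U
```

The rotated mass matrix equals $\gamma^0 e^{-2i\beta\gamma_5}$, so a mass term is invariant only for special values of `beta`; this is the matrix counterpart of the statement that chiral symmetry is exact only in the massless limit.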

It is interesting to compare the form of $\hat{Q}_5$ with that of the corresponding
operator $\int \hat\psi^\dagger\hat\psi\; d^3x$ in the non-chiral case (cf (7.51)). The difference has to
do with their behaviour under a transformation already discussed in section
4.2.1, namely parity. Under the parity transformation p → −p and thus, for
(12.140) and (12.141) to be covariant under parity, we require $\phi \to \chi$, $\chi \to \phi$;
this will ensure (as we saw in section 4.2.1) that the Dirac equation in the
parity-transformed frame will be consistent with the one in the original frame.
In the representation (12.138), this is equivalent to saying that the spinor $\omega_P$
in the parity-transformed frame is given by

$$ \omega_P = \gamma^0\omega, \tag{12.157} $$

which implies $\phi_P = \chi$, $\chi_P = \phi$.

All this carries over to the field theory case, with $\hat\psi_P(x, t) = \gamma^0\hat\psi(-x, t)$,
as we saw in section 7.5.1. Consider then the operator $\hat{Q}_5$ in the
parity-transformed frame:

$$ \begin{aligned} (\hat{Q}_5)_P &= \int \hat\psi_P^\dagger(x, t)\,\gamma_5\,\hat\psi_P(x, t)\; d^3x = \int \hat\psi^\dagger(-x, t)\,\gamma^0\gamma_5\gamma^0\,\hat\psi(-x, t)\; d^3x \\ &= -\int \hat\psi^\dagger(y, t)\,\gamma_5\,\hat\psi(y, t)\; d^3y = -\hat{Q}_5 \end{aligned} \tag{12.158} $$

where we used $\{\gamma^0, \gamma_5\} = 0$ and $(\gamma^0)^2 = 1$, and changed the integration
variable to y = −x. Hence $\hat{Q}_5$ is a ‘pseudoscalar’ operator, meaning that
it changes sign in the parity-transformed frame. We can also see this
directly from (12.156), making the interchange $\hat\phi \leftrightarrow \hat\chi$. In contrast, the
non-chiral operator $\int \hat\psi^\dagger\hat\psi\; d^3x$ is a (true) scalar, remaining the same in the
parity-transformed frame.

In a similar way, the appearance of the $\gamma_5$ in the current operator $\hat{j}_5^\mu =
\bar{\hat\psi}\gamma^\mu\gamma_5\hat\psi$ affects its parity properties: for example, the $\mu = 0$ component $\hat\psi^\dagger\gamma_5\hat\psi$
is a pseudoscalar, as we have seen. Problem 4.4(b) showed that the spatial
parts $\bar{\hat\psi}\gamma\gamma_5\hat\psi$ behave as an axial vector rather than a normal (polar) vector
under parity: that is, they behave like r × p for example, rather than like
r, in that they do not reverse sign under parity. Such a current is referred
to generally as an ‘axial vector current’, as opposed to the ordinary vector
currents with no $\gamma_5$.

As a consequence of (12.158), the operator $\hat{Q}_5$ changes the parity of any
state on which it acts. We can see this formally by introducing the (unitary)
parity operator $\hat{P}$ in field theory, such that states of definite parity $|+\rangle$, $|-\rangle$
satisfy

$$\hat{P}|+\rangle = |+\rangle, \quad \hat{P}|-\rangle = -|-\rangle. \tag{12.159}$$

Equation (12.158) then implies that $\hat{P}\hat{Q}_5\hat{P}^{-1} = -\hat{Q}_5$, following the normal
rule for operator transformations in quantum mechanics. Consider now the
state $\hat{Q}_5|+\rangle$. We have

$$\hat{P}\hat{Q}_5|+\rangle = (\hat{P}\hat{Q}_5\hat{P}^{-1})\hat{P}|+\rangle = -\hat{Q}_5|+\rangle \tag{12.160}$$

showing that $\hat{Q}_5|+\rangle$ is an eigenstate of $\hat{P}$ with the opposite eigenvalue, $-1$.

A very important physical consequence now follows from the fact that (in
this simple m = 0 model) $\hat{Q}_5$ is a symmetry operator commuting with the
Hamiltonian $\hat{H}$. We have

$$\hat{H}\hat{Q}_5|\psi\rangle = \hat{Q}_5\hat{H}|\psi\rangle = E\hat{Q}_5|\psi\rangle. \tag{12.161}$$

Hence for every state $|\psi\rangle$ with energy eigenvalue E, there should exist a state
$\hat{Q}_5|\psi\rangle$ with the same eigenvalue E and the opposite parity: that is, chiral
symmetry apparently implies the existence of ‘parity doublets’.

Of course, it may reasonably be objected that all of the above refers not 
only to the massless, but also the non-interacting case. However, this is just 
where the analysis begins to get interesting. Suppose we allow the fermion
field $\hat\psi$ to interact with a U(1)-gauge field $\hat{A}^\mu$ via the standard electromagnetic
coupling

$$\hat{\mathcal{L}}_{\rm int} = q\bar{\hat\psi}\gamma^\mu\hat\psi\hat{A}_\mu. \tag{12.162}$$

Remarkably enough, $\hat{\mathcal{L}}_{\rm int}$ is also invariant under the chiral transformation
(12.152), for the simple reason that the ‘Dirac’ structure of (12.162) is exactly
the same as that of the free kinetic term $\bar{\hat\psi}\gamma^\mu\partial_\mu\hat\psi$: the ‘covariant derivative’
prescription $\partial^\mu \to \hat{D}^\mu = \partial^\mu + iq\hat{A}^\mu$ automatically means that any ‘Dirac’ (e.g.
$\gamma_5$) symmetry of the kinetic part will be preserved when the gauge interaction
is included. Thus chirality remains a ‘good symmetry’ in the presence of a
U(1) gauge interaction.

The generalization of this to the more physical $m_u \approx m_d \approx 0$ case is quite
straightforward. The Lagrangian (12.87) becomes

$$\hat{\mathcal{L}} = \bar{\hat{q}}\,i\gamma^\mu\partial_\mu\hat{q} \tag{12.163}$$

as $m \to 0$, which is invariant under the $\gamma_5$-version of (12.89),⁶ namely

$$\hat{q}' = \exp(-i\boldsymbol{\beta}\cdot\boldsymbol{\tau}/2\,\gamma_5)\,\hat{q}. \tag{12.164}$$

There are three associated Noether currents (compare (12.109))

$$\hat{\boldsymbol{T}}^{(1/2)\mu}_{5} = \bar{\hat{q}}\gamma^\mu\gamma_5\frac{\boldsymbol{\tau}}{2}\hat{q} \tag{12.165}$$

which are axial vectors, and three associated ‘charge’ operators

$$\hat{\boldsymbol{T}}^{(1/2)}_{5} = \int \hat{q}^{\dagger}\gamma_5\frac{\boldsymbol{\tau}}{2}\hat{q}\,d^3x \tag{12.166}$$

which are pseudoscalars, belonging to the T = 1 representation of SU(2). We
have a new non-Abelian global symmetry, called chiral SU(2)$_f$, which we shall
denote by SU(2)$_{f\,5}$. As far as their action in the isospinor u-d space is concerned,
these chiral charges have exactly the same effect as the ordinary flavour
isospin operators of (12.109). But they are pseudoscalars rather than scalars,
and hence they flip the parity of a state on which they act. Thus, whereas
the isospin raising operator $\hat{T}^{(1/2)}_{+}$ is such that

$$\hat{T}^{(1/2)}_{+}|d\rangle = |u\rangle, \tag{12.167}$$

$\hat{T}^{(1/2)}_{+5}$ will also produce a u-type state from a d-type one via

$$\hat{T}^{(1/2)}_{+5}|d\rangle = |\tilde{u}\rangle, \tag{12.168}$$

but the $|\tilde{u}\rangle$ state will have opposite parity from $|u\rangle$. Further, since
$[\hat{T}^{(1/2)}_{+5}, \hat{H}] = 0$, this state $|\tilde{u}\rangle$ will be degenerate with $|d\rangle$. Similarly, the state
$|\tilde{d}\rangle$ produced via $\hat{T}^{(1/2)}_{-5}|u\rangle$ will have opposite parity from $|d\rangle$, and will be
degenerate with $|u\rangle$. The upshot is that we have two massless states $|u\rangle$, $|d\rangle$ of (say)
positive parity, and a further two massless states $|\tilde{u}\rangle$, $|\tilde{d}\rangle$ of negative parity, in this
simple model.

⁶ $\hat{\mathcal{L}}$ is also invariant under $\hat{q}' = e^{-i\beta\gamma_5}\hat{q}$, which is an ‘axial’ version of the global U(1)
associated with quark number conservation. We shall discuss this additional U(1)-symmetry
in section 18.1.1.

Suppose we now let the quarks interact, for example by an interaction of 
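the QCD type, already indicated in (12.136). In that case, the interaction
terms have the form

$$\bar{\hat{u}}\gamma^\mu\frac{\lambda_a}{2}\hat{u}\,\hat{A}_{a\mu} + \bar{\hat{d}}\gamma^\mu\frac{\lambda_a}{2}\hat{d}\,\hat{A}_{a\mu} \tag{12.169}$$

where

$$\hat{u} = \begin{pmatrix}\hat{u}_{\rm r}\\ \hat{u}_{\rm b}\\ \hat{u}_{\rm g}\end{pmatrix}, \quad \hat{d} = \begin{pmatrix}\hat{d}_{\rm r}\\ \hat{d}_{\rm b}\\ \hat{d}_{\rm g}\end{pmatrix} \tag{12.170}$$

and the 3 × 3 $\lambda$'s act in the r-b-g space. Just as in the previous U(1) case,
the interaction (12.169) is invariant under the global SU(2)$_{f\,5}$ chiral symmetry
(12.164), acting in the u-d space. Note that, somewhat confusingly, (12.169) is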
not a simple ‘gauging’ of (12.163): a covariant derivative is being introduced, 
but in the space of a new (colour) degree of freedom, not in flavour space. In 
fact, the flavour degrees of freedom are ‘inert’ in (12.169), so that it is invariant 
under SU(2) transformations, while the Dirac structure implies that it is also 
invariant under chiral SU(2)$_{f\,5}$ transformations (12.164). All the foregoing can
be extended unchanged to chiral SU(3)$_{f\,5}$, given that QCD is ‘flavour blind’,
and supposing that $m_s = 0$.

The effect of the QCD interactions must be to bind the quarks into nucleons,
such as the proton (uud) and neutron (udd). But what about the equally
possible states ($\tilde{u}$ud) and ($\tilde{u}$dd), for example? These would have to be degenerate
in mass with (uud) and (udd), and of opposite parity. Yet such ‘parity
doublet’ partners of the physical p and n are not observed, and so we have a
puzzle.

One might feel that this whole discussion is unrealistic, based as it is on 
massless quarks. Are the baryons then supposed to be massless too? If so, 
perhaps the discussion is idle, as they are evidently by no means massless. But 
it is not necessary to suppose that the mass of a relativistic bound state has 
any very simple relation to the masses of its constituents: its mass may derive, 
in part at least, from the interaction energy in the fields. Alternatively, one 
might suppose that somehow the finite mass of the u and d quarks, which of 
course breaks the chiral symmetry, splits the degeneracy of the nucleon parity 
doublets, promoting the negative parity ‘nucleon’ state to an acceptably high 
mass. But this seems very implausible, in view of the actual magnitudes of 
$m_u$ and $m_d$, compared to the nucleon masses.

In short, we have here a situation in which a symmetry of the Lagrangian 
(to an apparently good approximation) does not seem to result in the expected 
multiplet structure of the states. The resolution of this puzzle will have to 
await our discussion of ‘spontaneous symmetry breaking’, in Part VII. 

In conclusion, we note an important feature of the flavour symmetry currents
$\hat{\boldsymbol{T}}^{(1/2)\mu}$ and $\hat{\boldsymbol{T}}^{(1/2)\mu}_{5}$ discussed in this and the preceding section. Although
these currents have been introduced entirely within the context of strong interaction
symmetries, it is a remarkable fact that exactly these currents also
appear in strangeness-conserving semileptonic weak interactions such as $\beta$-decay,
as we shall see in chapter 20. (The fact that both appear is precisely
a manifestation of parity violation in weak interactions, as we noted in section
4.2.1.) Thus some of the physical consequences of ‘spontaneously broken
chiral symmetry’ will involve weak interaction quantities.


Problems 


12.1 Verify that the set of all unitary 2 x 2 matrices with determinant equal 
to +1 form a group, the law of combination being matrix multiplication. 


12.2 Derive (12.18). 
12.3 Check the commutation relations (12.28). 


12.4 Show that the T;’s defined by (12.45) satisfy (12.47). 


12.5 Write out each of the 3 × 3 matrices $T^{(1)}_i$ (i = 1, 2, 3) whose matrix
elements are given by (12.48), and verify that they satisfy the SU(2) commutation
relations (12.47).


12.6 Verify (12.62). 


12.7 Show that a general Hermitian traceless 3 x 3 matrix is parametrized 
by 8 real numbers. 


12.8 Check that (12.84) is consistent with (12.80) and the infinitesimal form
of (12.81), and verify that the matrices $G^{(8)}_a$ defined by (12.84) satisfy the
commutation relations (12.83).


12.9 Verify, by comparing the coefficients of $\epsilon_1$, $\epsilon_2$ and $\epsilon_3$ on both sides of
(12.99), that (12.100) follows from (12.99).


12.10 Verify that the operators $\hat{\boldsymbol{T}}^{(1/2)}$ defined by (12.101) satisfy (12.100).
(Note: use the anticommutation relations of the fermionic operators.)


12.11 Verify that the operators $\hat{\boldsymbol{T}}^{(1/2)}$ given by (12.101) satisfy the commutation
relations (12.103).


... The difference between a neutron and a proton is then a purely
arbitrary process. As usually conceived, however, this arbitrariness is 
subject to the following limitations: once one chooses what to call a 
proton, what a neutron, at one space time point, one is then not free 
to make any choices at other space time points. 

It seems that this is not consistent with the localized field concept 
that underlies the usual physical theories. In the present paper we wish 
to explore the possibility of requiring all interactions to be invariant 
under independent rotations of the isotopic spin at all space time points 


—Yang and Mills (1954) 


Consider the global SU(2) isospinor transformation (12.32), written here again,

$$\psi^{(1/2)\prime}(x) = \exp(i\boldsymbol{\alpha}\cdot\boldsymbol{\tau}/2)\,\psi^{(1/2)}(x) \tag{13.1}$$

for an isospin doublet wavefunction $\psi^{(1/2)}(x)$. The dependence of $\psi^{(1/2)}(x)$ on
the space-time coordinate x has now been included explicitly, but the parameters
$\boldsymbol{\alpha}$ are independent of x, which is why the transformation is called a ‘global’
one. As we have seen in the previous chapter, invariance under this transfor- 
mation amounts to the assertion that the choice of which two base states 
— (n,p), (u,d),... — to use is a matter of convention; any such non-Abelian 
phase transformation on a chosen pair produces another equally good pair. 
However, the choice cannot be made independently at all space-time points, 
only globally. To Yang and Mills (1954) (cf the quotation above) this seemed 
somehow an unaesthetic limitation of symmetry: ‘Once one chooses what to 
call a proton, what a neutron, at one space-time point, one is then not free 
to make any choices at other space-time points.’ They even suggested that 
this could be viewed as ‘inconsistent with the localised field concept’, and 
they therefore ‘explored the possibility’ of replacing this global (space-time 
independent) phase transformation by the local (space-time dependent) one 


$$\psi^{(1/2)\prime}(x) = \exp[ig\boldsymbol{\tau}\cdot\boldsymbol{\alpha}(x)/2]\,\psi^{(1/2)}(x) \tag{13.2}$$

in which the phase parameters $\boldsymbol{\alpha}(x)$ are also now functions of $x = (t, \boldsymbol{x})$ as
indicated. Notice that we have inserted a parameter g in the exponent to
make the analogy with the electromagnetic U(1) case

$$\psi'(x) = \exp[iq\chi(x)]\,\psi(x) \tag{13.3}$$


even stronger: g will be a coupling strength, analogous to the electromagnetic 
charge q. The consideration of theories based on (13.2) was the fundamental 
step taken by Yang and Mills (1954); see also Shaw (1955). 

Global symmetries and their associated (possibly approximate) conserva- 
tion laws are certainly important, but they do not have the dynamical signif- 
icance of local symmetries. We saw in section 7.4 how the ‘requirement’ of 
local U(1) phase invariance led almost automatically to the local gauge theory
of QED, in which the conserved current $\bar{\hat\psi}\gamma^\mu\hat\psi$ of the global U(1) symmetry is
‘promoted’ to the role of dynamical current which, when dotted into the gauge
field $\hat{A}_\mu$, gave the interaction term in $\hat{\mathcal{L}}_{\rm QED}$. A similar link between symmetry
and dynamics appears if, following Yang and Mills, we generalize the
non-Abelian global symmetries of the preceding chapter to local non-Abelian 
symmetries, which are the subject of the present one. 

However, as mentioned in the introduction to chapter 12, the original 
Yang-Mills attempt to get a theory of hadronic interactions by ‘localizing’ 
the flavour symmetry group SU(2) turned out not to be phenomenologically 
viable (although a remarkable attempt was made to push the idea further 
by Sakurai (1960)). In the event, the successful application of a local SU(2) 
symmetry was to the weak interactions. But this is complicated by the fact 
that the symmetry is ‘spontaneously broken’, and consequently we shall delay 
the discussion of this application until after QCD — which is the theory of 
strong interactions, but at the quark, rather than the composite (hadronic) 
level. QCD is based on the local form of an SU(3) symmetry; once again, 
however, it is not the flavour SU(3) of section 12.2, but a symmetry with 
respect to a totally new degree of freedom, colour. This will be introduced in 
the following chapter. 

Although the application of local SU(2) symmetry to the weak interactions 
will follow that of local SU(3) to the strong, we shall begin our discussion 
of local non-Abelian symmetries with the local SU(2) case, since the group 
theory is more familiar. We shall also start with the ‘wavefunction’ formalism, 
deferring the field theory treatment until section 13.3.

13.1 Local SU(2) symmetry
13.1.1 The covariant derivative and interactions with matter


In this section we shall introduce the main ideas of the non-Abelian SU(2)
gauge theory which results from the demand of invariance, or covariance,
under transformations such as (13.2). We shall generally use the language of
isospin when referring to the physical states and operators, bearing in mind
that this will eventually mean weak isospin.

We shall mimic as literally as possible the discussion of electromagnetic 
gauge covariance in sections 2.4 and 2.5 of volume 1. As in that case, no 
free particle wave equation can be covariant under the transformation (13.2) 
(taking the isospinor example for definiteness), since the gradient terms in the 
equation will act on the phase factor $\boldsymbol{\alpha}(x)$. However, wave equations with a
suitably defined covariant derivative can be covariant under (13.2); physically 
this means that, just as for electromagnetism, covariance under local non- 
Abelian phase transformations requires the introduction of a definite force 
field. 

In the electromagnetic case the covariant derivative is

$$D^\mu = \partial^\mu + iqA^\mu(x). \tag{13.4}$$

For convenience we recall here the crucial property of $D^\mu$. Under a local U(1)
phase transformation, a wavefunction transforms as (cf (13.3))

$$\psi(x) \to \psi'(x) = \exp(iq\chi(x))\psi(x), \tag{13.5}$$

from which it easily follows that the derivative (gradient) of $\psi$ transforms as

$$\partial^\mu\psi(x) \to \partial^\mu\psi'(x) = \exp(iq\chi(x))\partial^\mu\psi(x) + iq\partial^\mu\chi(x)\exp(iq\chi(x))\psi(x). \tag{13.6}$$

Comparing (13.6) with (13.5), we see that, in addition to the expected first
term on the right-hand side of (13.6), which has the same form as the right-hand
side of (13.5), there is an extra term in (13.6). By contrast, the covariant
derivative of $\psi$ transforms as (see section 2.4 of volume 1)

$$D^\mu\psi(x) \to D'^\mu\psi'(x) = \exp(iq\chi(x))D^\mu\psi(x) \tag{13.7}$$

exactly as in (13.5), with no additional term on the right-hand side. Note
that $D^\mu$ has to carry a prime also, since it contains $A^\mu$ which transforms to
$A'^\mu = A^\mu - \partial^\mu\chi(x)$ when $\psi$ transforms by (13.5). The property (13.7) ensured
the gauge covariance of wave equations in the U(1) case; the similar property
in the quantum field case meant that a globally U(1)-invariant Lagrangian
could be converted immediately to a locally U(1)-invariant one by replacing
$\partial^\mu$ by $D^\mu$ (section 7.4).
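The contrast between (13.6) and (13.7) can be illustrated numerically in a one-dimensional toy setting. The sketch below is only a consistency check under stated assumptions: the sample functions `psi`, `chi`, `potential`, the value of `Q` and the finite-difference helper are all hypothetical choices, and index placement is ignored because only one coordinate is used.

```python
import cmath
import math

Q = 0.7  # hypothetical charge, chosen only for illustration


def psi(x):
    """Sample wavefunction (a plane wave)."""
    return cmath.exp(1j * 1.3 * x)


def chi(x):
    """Sample gauge function chi(x)."""
    return math.sin(x)


def potential(x):
    """Sample gauge potential A(x), one component only."""
    return 0.5 * math.cos(2.0 * x)


def deriv(f, x, h=1e-5):
    """Central finite-difference derivative."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def psi_prime(x):
    """Transformed wavefunction, as in (13.5)."""
    return cmath.exp(1j * Q * chi(x)) * psi(x)


def potential_prime(x):
    """Transformed potential A' = A - d(chi)/dx."""
    return potential(x) - deriv(chi, x)


def covariant_derivative(f, a, x):
    """D f = (d/dx + i Q a) f, the one-dimensional analogue of (13.4)."""
    return deriv(f, x) + 1j * Q * a(x) * f(x)


x0 = 0.4
phase = cmath.exp(1j * Q * chi(x0))

# Ordinary derivative: picks up the extra term of (13.6).
extra_term = deriv(psi_prime, x0) - phase * deriv(psi, x0)

# Covariant derivative: transforms with the phase factor alone, as in (13.7).
lhs = covariant_derivative(psi_prime, potential_prime, x0)
rhs = phase * covariant_derivative(psi, potential, x0)

print("extra term in ordinary derivative:", abs(extra_term))
print("mismatch in covariant derivative:", abs(lhs - rhs))
```

The first printed number is of order one (it is the $iq\,\partial\chi$ term), while the second is zero up to finite-difference error, because the shift in the potential removes that term.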

In appendix D of volume 1 we introduced the idea of ‘covariance’ in the 
context of coordinate transformations of 3- and 4-vectors. The essential notion 
was of something ‘maintaining the same form’, or ‘transforming the same 
way’. The transformations being considered here are gauge transformations 
rather than coordinate ones; nevertheless it is true that, under them, $D^\mu\psi$
transforms in the same way as $\psi$, while $\partial^\mu\psi$ does not. Thus the term covariant
derivative seems appropriate. In fact, there is a much closer analogy between
the ‘coordinate’ and the ‘gauge’ cases, which we did not present in volume 1,
but give now in appendix N, for the interested reader.

We need the local SU(2) generalization of (13.4), appropriate to the local
SU(2) transformation (13.2). Just as in the U(1) case (13.6), the ordinary
gradient acting on $\psi^{(1/2)}(x)$ does not transform in the same way as $\psi^{(1/2)}(x)$:
taking $\partial^\mu$ of (13.2) leads to

$$\partial^\mu\psi^{(1/2)\prime}(x) = \exp[ig\boldsymbol{\tau}\cdot\boldsymbol{\alpha}(x)/2]\,\partial^\mu\psi^{(1/2)}(x) + ig\boldsymbol{\tau}\cdot\partial^\mu\boldsymbol{\alpha}(x)/2\,\exp[ig\boldsymbol{\tau}\cdot\boldsymbol{\alpha}(x)/2]\,\psi^{(1/2)}(x) \tag{13.8}$$


as can be checked by writing the matrix exponential exp[A] as the series

$$\exp[A] = \sum_{n=0}^{\infty} A^n/n!$$

and differentiating term by term. By analogy with (13.7), the key property
we demand for our SU(2) covariant derivative $D^\mu\psi^{(1/2)}$ is that this quantity
should transform like $\psi^{(1/2)}$, i.e. without the second term in (13.8). So we
require

$$(D'^\mu\psi^{(1/2)\prime}(x)) = \exp[ig\boldsymbol{\tau}\cdot\boldsymbol{\alpha}(x)/2]\,(D^\mu\psi^{(1/2)}(x)). \tag{13.9}$$

The definition of $D^\mu$ which generalizes (13.4) so as to fulfil this requirement
is

$$D^\mu\ (\text{acting on an isospinor}) = \partial^\mu + ig\boldsymbol{\tau}\cdot\boldsymbol{W}^\mu(x)/2. \tag{13.10}$$

The definition (13.10), as indicated on the left-hand side, is only appropriate
for isospinors $\psi^{(1/2)}$; it has to be suitably generalized for other $\psi^{(t)}$'s (see
(13.44)).

We now discuss (13.9) and (13.10) in detail. The $\partial^\mu$ is multiplied implicitly
by the unit 2 × 2 matrix, and the $\tau$'s act on the two-component space of $\psi^{(1/2)}$.
The $\boldsymbol{W}^\mu(x)$ are three independent gauge fields

$$\boldsymbol{W}^\mu = (W_1^\mu, W_2^\mu, W_3^\mu), \tag{13.11}$$


generalizing the single electromagnetic gauge field $A^\mu$. They are called SU(2)
gauge fields, or more generally Yang-Mills fields. The term $\boldsymbol{\tau}\cdot\boldsymbol{W}^\mu$ is then
the 2 × 2 matrix

$$\boldsymbol{\tau}\cdot\boldsymbol{W}^\mu = \begin{pmatrix} W_3^\mu & W_1^\mu - iW_2^\mu\\ W_1^\mu + iW_2^\mu & -W_3^\mu \end{pmatrix} \tag{13.12}$$

using the $\tau$'s of (12.25); the x-dependence of the $W^\mu$'s is understood. Let
us ‘decode’ the desired property (13.9), for the algebraically simpler case of
an infinitesimal local SU(2) transformation with parameters $\boldsymbol{\epsilon}(x)$, which are
of course functions of x since the transformation is local. In this case, $\psi^{(1/2)}$
transforms by

$$\psi^{(1/2)\prime} = (1 + ig\boldsymbol{\tau}\cdot\boldsymbol{\epsilon}(x)/2)\,\psi^{(1/2)} \tag{13.13}$$


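and the ‘uncovariant’ derivative $\partial^\mu\psi^{(1/2)}$ transforms by

$$\partial^\mu\psi^{(1/2)\prime} = (1 + ig\boldsymbol{\tau}\cdot\boldsymbol{\epsilon}(x)/2)\,\partial^\mu\psi^{(1/2)} + ig\boldsymbol{\tau}\cdot\partial^\mu\boldsymbol{\epsilon}(x)/2\,\psi^{(1/2)}, \tag{13.14}$$

where we have retained only the terms linear in $\boldsymbol{\epsilon}$ from an expansion of (13.8)
with $\boldsymbol{\alpha} \to \boldsymbol{\epsilon}$. We have now dropped the x-dependence of the $\psi^{(1/2)}$'s, but kept
that of $\boldsymbol{\epsilon}(x)$, and we have used the simple ‘1’ for the unit matrix in the two-dimensional
isospace. Equation (13.14) exhibits again an ‘extra piece’ on the
right-hand side, as compared to (13.13). On the other hand, inserting (13.10)
and (13.13) into our covariant derivative requirement (13.9) yields, for the
left-hand side in the infinitesimal case,

$$D'^\mu\psi^{(1/2)\prime} = (\partial^\mu + ig\boldsymbol{\tau}\cdot\boldsymbol{W}'^\mu/2)[1 + ig\boldsymbol{\tau}\cdot\boldsymbol{\epsilon}(x)/2]\,\psi^{(1/2)} \tag{13.15}$$

while the right-hand side is

$$[1 + ig\boldsymbol{\tau}\cdot\boldsymbol{\epsilon}(x)/2](\partial^\mu + ig\boldsymbol{\tau}\cdot\boldsymbol{W}^\mu/2)\,\psi^{(1/2)}. \tag{13.16}$$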


In order to verify that these are the same, however, we would need to know
$\boldsymbol{W}'^\mu$, that is, the transformation law for the three $W^\mu$ fields. Instead, we
shall proceed ‘in reverse’, and use the imposed equality between (13.15) and
(13.16) to determine the transformation law of $\boldsymbol{W}^\mu$.

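Suppose that, under this infinitesimal transformation,

$$\boldsymbol{W}^\mu \to \boldsymbol{W}'^\mu = \boldsymbol{W}^\mu + \delta\boldsymbol{W}^\mu. \tag{13.17}$$

Then the condition of equality is

$$[\partial^\mu + ig\boldsymbol{\tau}/2\cdot(\boldsymbol{W}^\mu + \delta\boldsymbol{W}^\mu)][1 + ig\boldsymbol{\tau}\cdot\boldsymbol{\epsilon}(x)/2]\,\psi^{(1/2)} = [1 + ig\boldsymbol{\tau}\cdot\boldsymbol{\epsilon}(x)/2](\partial^\mu + ig\boldsymbol{\tau}\cdot\boldsymbol{W}^\mu/2)\,\psi^{(1/2)}. \tag{13.18}$$

Multiplying out the terms, neglecting the term of second order involving the
product of $\delta\boldsymbol{W}^\mu$ and $\boldsymbol{\epsilon}$ and noting that

$$\partial^\mu(\boldsymbol{\epsilon}\psi) = (\partial^\mu\boldsymbol{\epsilon})\psi + \boldsymbol{\epsilon}(\partial^\mu\psi) \tag{13.19}$$

we see that many terms cancel and we are left with

$$ig\,\frac{\boldsymbol{\tau}\cdot\delta\boldsymbol{W}^\mu}{2}\,\psi^{(1/2)} = -ig\,\frac{\boldsymbol{\tau}\cdot\partial^\mu\boldsymbol{\epsilon}(x)}{2}\,\psi^{(1/2)} + (ig)^2\left[\left(\frac{\boldsymbol{\tau}\cdot\boldsymbol{\epsilon}(x)}{2}\right)\left(\frac{\boldsymbol{\tau}\cdot\boldsymbol{W}^\mu}{2}\right) - \left(\frac{\boldsymbol{\tau}\cdot\boldsymbol{W}^\mu}{2}\right)\left(\frac{\boldsymbol{\tau}\cdot\boldsymbol{\epsilon}(x)}{2}\right)\right]\psi^{(1/2)}. \tag{13.20}$$

Using the identity for Pauli matrices (see problem 3.4(b))

$$\boldsymbol{\sigma}\cdot\boldsymbol{a}\;\boldsymbol{\sigma}\cdot\boldsymbol{b} = \boldsymbol{a}\cdot\boldsymbol{b} + i\boldsymbol{\sigma}\cdot\boldsymbol{a}\times\boldsymbol{b} \tag{13.21}$$

this yields

$$\boldsymbol{\tau}\cdot\delta\boldsymbol{W}^\mu = -\boldsymbol{\tau}\cdot\partial^\mu\boldsymbol{\epsilon}(x) - g\boldsymbol{\tau}\cdot(\boldsymbol{\epsilon}(x)\times\boldsymbol{W}^\mu). \tag{13.22}$$

Equating components of $\boldsymbol{\tau}$ on both sides, we deduce

$$\delta\boldsymbol{W}^\mu = -\partial^\mu\boldsymbol{\epsilon}(x) - g[\boldsymbol{\epsilon}(x)\times\boldsymbol{W}^\mu]. \tag{13.23}$$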


The reader may note the close similarity between these manipulations and 
those encountered in section 12.1.3. 
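The step from (13.20) to (13.22) rests on the Pauli-matrix identity (13.21), which implies that the commutator of $\boldsymbol{\tau}\cdot\boldsymbol{\epsilon}$ and $\boldsymbol{\tau}\cdot\boldsymbol{W}$ equals $2i\,\boldsymbol{\tau}\cdot(\boldsymbol{\epsilon}\times\boldsymbol{W})$. A minimal numerical sketch of both statements follows; the sample vectors and the helper names are arbitrary choices made for illustration only.

```python
def tau_dot(v):
    """2x2 matrix tau . v for a real 3-vector v, laid out as in (13.12)."""
    x, y, z = v
    return [[z, x - 1j * y], [x + 1j * y, -z]]


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def cross(a, b):
    """Ordinary vector product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Ordinary scalar product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def max_abs_diff(a, b):
    """Largest element-wise difference between two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


eps = (0.3, -1.2, 0.7)   # sample values standing in for epsilon(x)
w = (1.1, 0.4, -0.9)     # sample values standing in for W^mu at one point

# Left- and right-hand sides of the identity (13.21).
product = matmul(tau_dot(eps), tau_dot(w))
tc = tau_dot(cross(eps, w))
identity_rhs = [[dot(eps, w) + 1j * tc[0][0], 1j * tc[0][1]],
                [1j * tc[1][0], dot(eps, w) + 1j * tc[1][1]]]

# Commutator [tau.eps, tau.W] against 2i tau.(eps x W).
reverse = matmul(tau_dot(w), tau_dot(eps))
commutator = [[product[i][j] - reverse[i][j] for j in range(2)] for i in range(2)]
expected_commutator = [[2j * tc[i][j] for j in range(2)] for i in range(2)]

print("identity mismatch:", max_abs_diff(product, identity_rhs))
print("commutator mismatch:", max_abs_diff(commutator, expected_commutator))
```

Inserting this commutator into (13.20) gives $(ig)^2\,(2i/4)\,\boldsymbol{\tau}\cdot(\boldsymbol{\epsilon}\times\boldsymbol{W}^\mu)$, and dividing by $ig/2$ reproduces the cross-product term of (13.22).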

Equation (13.23) defines the way in which the SU(2) gauge fields $\boldsymbol{W}^\mu$
transform under an infinitesimal SU(2) gauge transformation. If it were not
for the presence of the first term $\partial^\mu\boldsymbol{\epsilon}(x)$ on the right-hand side, (13.23) would
be simply the (infinitesimal) transformation law for the T = 1 triplet representation
of SU(2) (see (12.64) and (12.65) in section 12.1.3). As mentioned at
the end of section 12.2, the T = 1 representation is the ‘adjoint’, or ‘regular’,
representation of SU(2), and this is the one to which gauge fields belong, in
general. But there is the extra term $-\partial^\mu\boldsymbol{\epsilon}(x)$. Clearly this is directly analogous
to the $-\partial^\mu\chi(x)$ term in the transformation of the U(1) gauge field $A^\mu$;
here, an independent infinitesimal function $\epsilon_i(x)$ is required for each component
$W_i^\mu(x)$. If the $\epsilon$'s were independent of x, then $\partial^\mu\boldsymbol{\epsilon}(x)$ would of course
vanish and the transformation law (13.23) would indeed be just that of an
SU(2) triplet. Thus we can say that under global SU(2) transformations, the
$\boldsymbol{W}^\mu$ behave as a normal triplet. But under local SU(2) transformations they
acquire the additional $-\partial^\mu\boldsymbol{\epsilon}(x)$ piece, and thus no longer transform ‘properly’,
as an SU(2) triplet. In exactly the same way, $\partial^\mu\psi^{(1/2)}$ did not transform
‘properly’ as an SU(2) doublet, under a local SU(2) transformation, because
of the second term in (13.14), which also involves $\partial^\mu\boldsymbol{\epsilon}(x)$. The remarkable result
behind the fact that $D^\mu\psi^{(1/2)}$ does transform ‘properly’ under local SU(2)
transformations, is that the extra term in (13.23) precisely cancels that in
(13.14)!

To summarize progress so far: we have shown that, for infinitesimal trans- 
formations, the relation 


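$$(D'^\mu\psi^{(1/2)\prime}) = [1 + ig\boldsymbol{\tau}\cdot\boldsymbol{\epsilon}(x)/2]\,(D^\mu\psi^{(1/2)}) \tag{13.24}$$

(where $D^\mu$ is given by (13.10)) holds true if in addition to the infinitesimal
local SU(2) phase transformation on $\psi^{(1/2)}$

$$\psi^{(1/2)\prime} = [1 + ig\boldsymbol{\tau}\cdot\boldsymbol{\epsilon}(x)/2]\,\psi^{(1/2)} \tag{13.25}$$

the gauge fields transform according to

$$\boldsymbol{W}'^\mu = \boldsymbol{W}^\mu - \partial^\mu\boldsymbol{\epsilon}(x) - g[\boldsymbol{\epsilon}(x)\times\boldsymbol{W}^\mu]. \tag{13.26}$$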


In obtaining these results, the form (13.10) for the covariant derivative has 
been assumed, and only the infinitesimal version of (13.2) has been treated 
explicitly. It turns out that (13.10) is still appropriate for the finite (non- 
infinitesimal) transformation (13.2), but the associated transformation law 
for the gauge fields is then slightly more complicated than (13.26). Let us 
write 


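$$U(\boldsymbol{\alpha}(x)) = \exp[ig\boldsymbol{\tau}\cdot\boldsymbol{\alpha}(x)/2] \tag{13.27}$$

so that $\psi^{(1/2)}$ transforms by

$$\psi^{(1/2)\prime} = U(\boldsymbol{\alpha}(x))\,\psi^{(1/2)}. \tag{13.28}$$

Then we require

$$D'^\mu\psi^{(1/2)\prime} = U(\boldsymbol{\alpha}(x))\,D^\mu\psi^{(1/2)}. \tag{13.29}$$

The left-hand side is

$$(\partial^\mu + ig\boldsymbol{\tau}\cdot\boldsymbol{W}'^\mu/2)\,U(\boldsymbol{\alpha}(x))\,\psi^{(1/2)} = (\partial^\mu U)\psi^{(1/2)} + U\partial^\mu\psi^{(1/2)} + ig\boldsymbol{\tau}\cdot\boldsymbol{W}'^\mu/2\;U\psi^{(1/2)}, \tag{13.30}$$

while the right-hand side is

$$U(\partial^\mu + ig\boldsymbol{\tau}\cdot\boldsymbol{W}^\mu/2)\,\psi^{(1/2)}. \tag{13.31}$$

The $U\partial^\mu\psi^{(1/2)}$ terms cancel leaving

$$(\partial^\mu U)\psi^{(1/2)} + ig\boldsymbol{\tau}\cdot\boldsymbol{W}'^\mu/2\;U\psi^{(1/2)} = U\,ig\boldsymbol{\tau}\cdot\boldsymbol{W}^\mu/2\;\psi^{(1/2)}. \tag{13.32}$$

Since this has to be true for all (two-component) $\psi^{(1/2)}$'s, we can treat it as an
operator equation acting in the space of $\psi^{(1/2)}$'s to give

$$\partial^\mu U + ig\boldsymbol{\tau}\cdot\boldsymbol{W}'^\mu/2\;U = U\,ig\boldsymbol{\tau}\cdot\boldsymbol{W}^\mu/2, \tag{13.33}$$

or equivalently

$$\frac{1}{2}\boldsymbol{\tau}\cdot\boldsymbol{W}'^\mu = \frac{i}{g}(\partial^\mu U)U^{-1} + U\,\frac{1}{2}\boldsymbol{\tau}\cdot\boldsymbol{W}^\mu\,U^{-1}, \tag{13.34}$$

which defines the (finite) transformation law for SU(2) gauge fields. Problem
13.1 verifies that (13.34) reduces to (13.26) in the infinitesimal case $\boldsymbol{\alpha}(x) \to \boldsymbol{\epsilon}(x)$.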
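As an outline of how that reduction might go (a sketch only, working to first order in $\boldsymbol{\epsilon}$ and using the commutator $[\boldsymbol{\tau}\cdot\boldsymbol{a}, \boldsymbol{\tau}\cdot\boldsymbol{b}] = 2i\,\boldsymbol{\tau}\cdot(\boldsymbol{a}\times\boldsymbol{b})$ that follows from (13.21)):

```latex
\begin{align*}
U &\approx 1 + ig\,\boldsymbol{\tau}\cdot\boldsymbol{\epsilon}/2, \qquad
U^{-1} \approx 1 - ig\,\boldsymbol{\tau}\cdot\boldsymbol{\epsilon}/2, \\
\frac{i}{g}(\partial^\mu U)U^{-1} &\approx \frac{i}{g}\cdot\frac{ig}{2}\,\boldsymbol{\tau}\cdot\partial^\mu\boldsymbol{\epsilon}
  = -\frac{1}{2}\,\boldsymbol{\tau}\cdot\partial^\mu\boldsymbol{\epsilon}, \\
U\,\frac{1}{2}\boldsymbol{\tau}\cdot\boldsymbol{W}^\mu\,U^{-1}
  &\approx \frac{1}{2}\boldsymbol{\tau}\cdot\boldsymbol{W}^\mu
  + \frac{ig}{4}\,[\boldsymbol{\tau}\cdot\boldsymbol{\epsilon},\,\boldsymbol{\tau}\cdot\boldsymbol{W}^\mu]
  = \frac{1}{2}\boldsymbol{\tau}\cdot\boldsymbol{W}^\mu
  - \frac{g}{2}\,\boldsymbol{\tau}\cdot(\boldsymbol{\epsilon}\times\boldsymbol{W}^\mu), \\
\Rightarrow\quad \boldsymbol{W}'^\mu &= \boldsymbol{W}^\mu - \partial^\mu\boldsymbol{\epsilon} - g\,\boldsymbol{\epsilon}\times\boldsymbol{W}^\mu .
\end{align*}
```

The last line is obtained by adding the two middle lines and equating the coefficients of $\boldsymbol{\tau}/2$; it agrees with (13.26).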


Suppose now that we consider a Dirac equation for $\psi^{(1/2)}$:

$$(i\gamma_\mu\partial^\mu - m)\,\psi^{(1/2)} = 0 \tag{13.35}$$

where both the ‘isospinor’ components of $\psi^{(1/2)}$ are four-component Dirac
spinors. We assert that we can ensure local SU(2) gauge covariance by replacing
$\partial^\mu$ in this equation by the covariant derivative of (13.10). Indeed, we
have

$$U(\boldsymbol{\alpha}(x))[i\gamma_\mu D^\mu - m]\,\psi^{(1/2)} = i\gamma_\mu U(\boldsymbol{\alpha}(x))D^\mu\psi^{(1/2)} - mU(\boldsymbol{\alpha}(x))\psi^{(1/2)} = (i\gamma_\mu D'^\mu - m)\,\psi^{(1/2)\prime} \tag{13.36}$$

using equations (13.9) and (13.28). Thus if

$$(i\gamma_\mu D^\mu - m)\,\psi^{(1/2)} = 0 \tag{13.37}$$

then

$$(i\gamma_\mu D'^\mu - m)\,\psi^{(1/2)\prime} = 0, \tag{13.38}$$

proving the asserted covariance. In the same way, any free particle wave
equation satisfied by an ‘isospinor’ $\psi^{(1/2)}$ (the relevant equation is determined
by the Lorentz spin of the particles involved) can be made locally covariant
by the use of the covariant derivative $D^\mu$, just as in the U(1) case.

FIGURE 13.1
Vertex for isospinor-W interaction.

The essential point here, of course, is that the locally covariant form includes
interactions between the $\psi^{(1/2)}$'s and the gauge fields $\boldsymbol{W}^\mu$, which are
determined by the local phase invariance requirement (the ‘gauge principle’).
Indeed, we can already begin to find some of the Feynman rules appropriate
to tree graphs for SU(2) gauge theories. Consider again the case of an SU(2)
isospinor fermion, $\psi^{(1/2)}$, obeying equation (13.38). This can be written as

$$(i\gamma_\mu\partial^\mu - m)\,\psi^{(1/2)} = g(\boldsymbol{\tau}/2)\cdot\gamma_\mu\boldsymbol{W}^\mu\,\psi^{(1/2)}. \tag{13.39}$$

In lowest-order perturbation theory the one-W emission/absorption process
is given by the amplitude (cf (8.39) for the electromagnetic case)

$$-ig\int \bar\psi^{(1/2)}_{\rm f}(\boldsymbol{\tau}/2)\gamma_\mu\psi^{(1/2)}_{\rm i}\cdot\boldsymbol{W}^\mu\,d^4x \tag{13.40}$$

exactly as advertized (for the field-theoretic vertex) in (12.129). The matrix
degree of freedom in the $\tau$'s is sandwiched between the two-component
isospinors $\psi^{(1/2)}$; the $\gamma$ matrix acts on the four-component (Dirac) parts of
$\psi^{(1/2)}$. The external $W^\mu$ field is now specified by a spin-1 polarization vector
$\epsilon^\mu$, like a photon, and by an ‘SU(2) polarization vector’ $a^r$ (r = 1, 2, 3) which
tells us which of the three SU(2) W-states is participating. The Feynman rule
for figure 13.1 is therefore

$$-ig(\tau^r/2)\gamma_\mu \tag{13.41}$$

which is to be sandwiched between spinors/isospinors $u_{\rm i}$, $\bar{u}_{\rm f}$ and dotted into
$\epsilon^\mu$ and $a^r$. Equation (13.41) is a very economical generalization of rule (ii) in Comment
(3) of section 8.3.1.

The foregoing is easily generalized to SU(2) multiplets other than doublets.
We shall change the notation slightly to use t instead of T for the ‘isospin’
quantum number, so as to emphasize that it is not the hadronic isospin, for
which we retain T; t will be the symbol used for the weak isospin to be
introduced in chapter 20. The general local SU(2) transformation for a t-multiplet
is then

$$\psi^{(t)} \to \psi^{(t)\prime} = \exp[ig\boldsymbol{\alpha}(x)\cdot\boldsymbol{T}^{(t)}]\,\psi^{(t)} \tag{13.42}$$

where the (2t + 1) × (2t + 1) matrices $T^{(t)}_i$ (i = 1, 2, 3) satisfy (cf (12.47))

$$[T^{(t)}_i, T^{(t)}_j] = i\epsilon_{ijk}T^{(t)}_k. \tag{13.43}$$

The appropriate covariant derivative is

$$D^\mu = \partial^\mu + ig\boldsymbol{T}^{(t)}\cdot\boldsymbol{W}^\mu \tag{13.44}$$

which is a (2t + 1) × (2t + 1) matrix acting on the (2t + 1) components of
$\psi^{(t)}$. The gauge fields interact with such ‘isomultiplets’ in a universal way
(only one g, the same for all the particles) which is prescribed by the local
covariance requirement to be simply that interaction which is generated by the
covariant derivatives. The fermion vertex corresponding to (13.44) is obtained
by replacing $\boldsymbol{\tau}/2$ in (13.40) by $\boldsymbol{T}^{(t)}$.
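The commutation relations (13.43) can be checked numerically for the two smallest non-trivial multiplets. The sketch below assumes the standard forms $T^{(1/2)}_i = \tau_i/2$ for the doublet and $(T^{(1)}_i)_{jk} = -i\epsilon_{ijk}$ for the triplet; the latter is the usual adjoint form and is taken here as an assumption rather than quoted from the text. All function names are illustrative.

```python
def levi_civita(i, j, k):
    """Totally antisymmetric symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2


def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def commutator(a, b):
    """Matrix commutator [a, b]."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]


def satisfies_su2(gens, tol=1e-12):
    """True if [T_i, T_j] = i eps_ijk T_k holds for the three given matrices."""
    n = len(gens[0])
    for i in range(3):
        for j in range(3):
            lhs = commutator(gens[i], gens[j])
            for r in range(n):
                for c in range(n):
                    rhs = sum(1j * levi_civita(i, j, k) * gens[k][r][c]
                              for k in range(3))
                    if abs(lhs[r][c] - rhs) > tol:
                        return False
    return True


# Doublet: T_i = tau_i / 2 (Pauli matrices divided by two).
doublet = [
    [[0, 0.5], [0.5, 0]],
    [[0, -0.5j], [0.5j, 0]],
    [[0.5, 0], [0, -0.5]],
]

# Triplet (assumed adjoint form): (T_i)_{jk} = -i eps_ijk.
triplet = [[[-1j * levi_civita(i, j, k) for k in range(3)] for j in range(3)]
           for i in range(3)]

# Rescaled doublet generators (the bare Pauli matrices) for comparison.
rescaled = [[[2 * x for x in row] for row in m] for m in doublet]

print("doublet satisfies (13.43):", satisfies_su2(doublet))
print("triplet satisfies (13.43):", satisfies_su2(triplet))
print("rescaled doublet satisfies (13.43):", satisfies_su2(rescaled))
```

The rescaled set fails the check because a commutator is quadratic in the generators while its right-hand side is linear, so the normalization of the $T$'s cannot be changed freely.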

We end this section with some comments: 


(i) It is a remarkable fact that only one constant g is needed. This is not the
same as in electromagnetism. There, each charged field interacts with the
gauge field $A^\mu$ via a coupling whose strength is its charge (e, −e, 2e, −5e, ...).
The crucial point is the appearance of the quadratic $g^2$ multiplying the
commutator of the $\tau$'s, $[\boldsymbol{\tau}\cdot\boldsymbol{\epsilon}, \boldsymbol{\tau}\cdot\boldsymbol{W}]$, in the $W^\mu$ transformation (equation
(13.20)). In the electromagnetic case, there is no such commutator: the
associated U(1) phase group is Abelian. As signalled by the presence of
$g^2$, a commutator is a non-linear quantity, and the scale of quantities appearing
in such commutation relations is not arbitrary. It is an instructive
exercise to check that, once $\delta\boldsymbol{W}^\mu$ is given by equation (13.23) (in the
SU(2) case), then the g's appearing in $\psi^{(1/2)\prime}$ (equation (13.13)) and $\psi^{(t)\prime}$
(via the infinitesimal version of equation (13.42)) must be the same as the
one appearing in $\delta\boldsymbol{W}^\mu$.


(ii) According to the foregoing argument, it is actually a mystery why electric 
charge should be quantized. Since it is the coupling constant of an Abelian 
group, each charged field could have an arbitrary charge from this point 
of view: there are no commutators to fix the scale. This is one of the 
motivations of attempts to ‘embed’ the electromagnetic gauge transfor- 
mations inside a larger non-Abelian group structure. Such is the case, for 
example, in ‘grand unified theories’ of strong, weak and electromagnetic 
interactions.

(iii) Finally we draw attention to the extremely important physical significance
of the second term in $\delta\boldsymbol{W}^\mu$ (equation (13.23)). The gauge fields themselves
are not ‘inert’ as far as the gauge group is concerned: in the SU(2) case
they have ‘isospin’ 1, while for a general group they belong to the regular
representation of the group. This is profoundly different from the electromagnetic
case, where the gauge field $A^\mu$ for the photon is of course
uncharged: quite simply, e = 0 for a photon, and the second term in
(13.23) is absent for $A^\mu$. The fact that non-Abelian (Yang-Mills) gauge
fields carry non-Abelian ‘charge’ degrees of freedom means that, since
they are also the quanta of the force field, they will necessarily interact
with themselves. Thus a non-Abelian gauge theory of gauge fields alone,
with no ‘matter’ fields, has non-trivial interactions and is not a free theory.


We shall examine the form of these ‘self-interactions’ in section 13.3.2. 
First, we need to find the equivalent, for the Yang-Mills field, of the Maxwell
field strength tensor $F^{\mu\nu}$, which gave us the gauge-invariant formulation of
Maxwell’s equations, and in terms of which the Maxwell Lagrangian can be
immediately written down.


13.1.2 The non-Abelian field strength tensor 


A simple way of arriving at the desired quantity is to consider the commutator 
of two covariant derivatives, as we can see by calculating it for the U(1) case. 
We find 

$$[D^\mu, D^\nu]\psi \equiv (D^\mu D^\nu - D^\nu D^\mu)\psi = iqF^{\mu\nu}\psi \tag{13.45}$$

as is verified in problem 13.2. Equation (13.45) suggests that we will find the
SU(2) analogue of $F^{\mu\nu}$ by evaluating

$$[D^\mu, D^\nu]\,\psi^{(1/2)} \tag{13.46}$$

where as usual

$$D^\mu\ (\text{on } \psi^{(1/2)}) = \partial^\mu + ig\boldsymbol{\tau}\cdot\boldsymbol{W}^\mu/2. \tag{13.47}$$

Problem 13.3 confirms that the result is

$$[D^\mu, D^\nu]\,\psi^{(1/2)} = ig\boldsymbol{\tau}/2\cdot(\partial^\mu\boldsymbol{W}^\nu - \partial^\nu\boldsymbol{W}^\mu - g\boldsymbol{W}^\mu\times\boldsymbol{W}^\nu)\,\psi^{(1/2)}; \tag{13.48}$$

the manipulations are very similar to those in (13.20)-(13.23). Noting the
analogy between the right-hand side of (13.48) and (13.45), we accordingly
expect the SU(2) ‘curvature’ or field strength tensor, to be given by

$$\boldsymbol{F}^{\mu\nu} = \partial^\mu\boldsymbol{W}^\nu - \partial^\nu\boldsymbol{W}^\mu - g\boldsymbol{W}^\mu\times\boldsymbol{W}^\nu \tag{13.49}$$

or, in component notation,

$$F_i^{\mu\nu} = \partial^\mu W_i^\nu - \partial^\nu W_i^\mu - g\epsilon_{ijk}W_j^\mu W_k^\nu. \tag{13.50}$$

This tensor is of fundamental importance in a (non-Abelian) gauge theory.
Since it arises from the commutator of two gauge-covariant derivatives, we are
guaranteed that it itself is gauge covariant; that is to say, ‘it transforms under
local SU(2) transformations in the way its SU(2) structure would indicate’.
Now $\boldsymbol{F}^{\mu\nu}$ has clearly three SU(2) components and must be an SU(2) triplet:
indeed, it is true that under an infinitesimal local SU(2) transformation


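$$\boldsymbol{F}'^{\mu\nu} = \boldsymbol{F}^{\mu\nu} - g\boldsymbol{\epsilon}(x)\times\boldsymbol{F}^{\mu\nu} \tag{13.51}$$

which is the expected law (cf (12.64)) for an SU(2) triplet. Problem 13.4
verifies that (13.51) follows from (13.49) and the transformation law (13.23)
for the $\boldsymbol{W}^\mu$ fields. Note particularly that $\boldsymbol{F}^{\mu\nu}$ transforms ‘properly’, as an
SU(2) triplet should, without the $\partial^\mu$ part which appears in $\delta\boldsymbol{W}^\mu$.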
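Two pieces of this structure can be checked numerically in the simplest possible setting, namely x-independent fields and parameters, so that every derivative term vanishes. This is a deliberately restricted sketch: it tests only the non-Abelian (cross-product) terms of (13.48) and (13.51), and the coupling value, sample vectors and helper names are all hypothetical.

```python
G = 0.65  # hypothetical coupling strength, for illustration only


def tau_dot(v):
    """2x2 matrix tau . v for a real 3-vector v, laid out as in (13.12)."""
    x, y, z = v
    return [[z, x - 1j * y], [x + 1j * y, -z]]


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def scale(c, m):
    """Multiply every matrix element by the number c."""
    return [[c * x for x in row] for row in m]


def cross(a, b):
    """Ordinary vector product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def vec_scale(c, v):
    return tuple(c * x for x in v)


def vec_add(a, b):
    return tuple(x + y for x, y in zip(a, b))


def max_abs_diff(a, b):
    """Largest element-wise difference between two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


w_mu = (0.2, -0.5, 1.0)   # constant sample values for W^mu
w_nu = (0.7, 0.1, -0.3)   # constant sample values for W^nu
eps = (0.3, -1.2, 0.7)    # constant (global) transformation parameters

# Field strength (13.49) with the derivative terms dropped.
f_munu = vec_scale(-G, cross(w_mu, w_nu))

# (1) Commutator of covariant derivatives on constant fields:
#     (ig)^2 [tau.W^mu/2, tau.W^nu/2] should equal ig tau/2 . F^{mu nu}.
a, b = scale(0.5, tau_dot(w_mu)), scale(0.5, tau_dot(w_nu))
ab, ba = matmul(a, b), matmul(b, a)
comm = scale((1j * G) ** 2,
             [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)])
expected_comm = scale(1j * G / 2, tau_dot(f_munu))

# (2) First-order change of F under delta W = -g eps x W (global case),
#     compared with the triplet law delta F = -g eps x F of (13.51).
dw_mu = vec_scale(-G, cross(eps, w_mu))
dw_nu = vec_scale(-G, cross(eps, w_nu))
delta_f = vec_scale(-G, vec_add(cross(dw_mu, w_nu), cross(w_mu, dw_nu)))
expected_delta_f = vec_scale(-G, cross(eps, f_munu))

print("commutator mismatch:", max_abs_diff(comm, expected_comm))
print("delta F mismatch:",
      max(abs(x - y) for x, y in zip(delta_f, expected_delta_f)))
```

The second comparison is the Jacobi identity for the vector product in disguise. Extending the check to x-dependent $\boldsymbol{\epsilon}(x)$ would also exercise the cancellation of the $\partial^\mu\boldsymbol{\epsilon}$ terms, which this sketch does not attempt.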

This non-Abelian $\boldsymbol{F}^{\mu\nu}$ is a much more interesting object than the Abelian
$F^{\mu\nu}$ (which is actually U(1)-gauge invariant, of course: $F'^{\mu\nu} = F^{\mu\nu}$). $\boldsymbol{F}^{\mu\nu}$
contains the gauge coupling constant g, confirming (cf comment (iii) in section
13.1.1) that the gauge fields themselves carry SU(2) ‘charge’, and act
as sources for the field strength. Appendix N shows how these field strength
tensors may be regarded as analogous to geometrical curvatures.

It is now straightforward to move to the quantum field case and construct
the SU(2) Yang-Mills analogue of the Maxwell Lagrangian $-\frac{1}{4}\hat{F}_{\mu\nu}\hat{F}^{\mu\nu}$. It is
simply $-\frac{1}{4}\hat{\boldsymbol{F}}_{\mu\nu}\cdot\hat{\boldsymbol{F}}^{\mu\nu}$, the SU(2) ‘dot product’ ensuring SU(2) invariance (see
problem 13.5), even under local transformation, in view of the transformation
law (13.51). But before proceeding in this way we first need to introduce local
SU(3) symmetry.




13.2 Local SU(3) Symmetry 


Using what has been done for global SU(3) symmetry in section 12.2, and 
the preceding discussion of how to make a global SU(2) into a local one, it 
is straightforward to develop the corresponding theory of local SU(3). This 
is the gauge group of QCD, the three degrees of freedom of the fundamental 
quark triplet now referring to “colour”, as will be further discussed in chapter 
14. We denote the basic triplet by $\psi$, which transforms under a local SU(3)
transformation according to


$$\psi' = \exp[\mathrm{i} g_s \boldsymbol{\lambda}\cdot\boldsymbol{\alpha}(x)/2]\,\psi, \tag{13.52}$$


which is the same as the global transformation (12.74) but with the 8 constant
parameters $\boldsymbol{\alpha}$ replaced by $x$-dependent ones, and with a coupling strength $g_s$
inserted. The SU(3)-covariant derivative, when acting on an SU(3) triplet $\psi$,
is given by the indicated generalization of (13.10), namely


$$D^\mu\ (\text{acting on SU(3) triplet}) = \partial^\mu + \mathrm{i} g_s \boldsymbol{\lambda}/2\cdot\boldsymbol{A}^\mu \tag{13.53}$$

where $A_1^\mu, A_2^\mu, \ldots, A_8^\mu$ are eight gauge fields which are called gluons. The
coupling is denoted by ‘$g_s$’ in anticipation of the application to strong
interactions via QCD.

The infinitesimal version of (13.52) is (cf (13.13)) 


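$$\psi' = (1 + \mathrm{i} g_s \boldsymbol{\lambda}\cdot\boldsymbol{\eta}(x)/2)\,\psi \tag{13.54}$$

where ‘1’ stands for the unit matrix in the three-dimensional space of
components of the triplet $\psi$. As in (13.14), it is clear that $\partial^\mu\psi'$ will involve an
‘unwanted’ term $\partial^\mu\boldsymbol{\eta}(x)$. By contrast, the desired covariant derivative $D^\mu\psi$
should transform according to


$$D'^\mu\psi' = (1 + \mathrm{i} g_s \boldsymbol{\lambda}\cdot\boldsymbol{\eta}(x)/2)\,D^\mu\psi \tag{13.55}$$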


without the $\partial^\mu\boldsymbol{\eta}(x)$ term. Problem 13.6 verifies that this is fulfilled by having
the gauge fields transform by


$$A'^\mu_a = A_a^\mu - \partial^\mu\eta_a(x) - g_s f_{abc}\,\eta_b(x) A_c^\mu. \tag{13.56}$$


Comparing (13.56) with (12.80) we can identify the term in $f_{abc}$ as telling us
that the 8 fields $A_a^\mu$ transform as an SU(3) octet, the $\eta$’s now depending on
$x$, of course. This is the adjoint, or regular representation of SU(3), as we
have now come to expect for gauge fields. However, the $\partial^\mu\eta_a(x)$ piece spoils
this simple transformation property under local transformations. But it is
just what is needed to cancel the corresponding $\partial^\mu\boldsymbol{\eta}(x)$ term in $\partial^\mu\psi'$, leaving
$D^\mu\psi$ transforming as a proper triplet via (13.55). The finite version of (13.56)
can be derived as in section 13.1 for SU(2), but we shall not need the result
here.
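As in the SU(2) case, the free Dirac equation for an SU(3)-triplet $\psi$,


$$(\mathrm{i}\gamma_\mu\partial^\mu - m)\psi = 0, \tag{13.57}$$


can be ‘promoted’ into one which is covariant under local SU(3)
transformations by replacing $\partial^\mu$ by $D^\mu$ of (13.53), leading to


$$(\mathrm{i}\slashed{\partial} - m)\psi = g_s \boldsymbol{\lambda}/2\cdot\slashed{\boldsymbol{A}}\,\psi \tag{13.58}$$


(compare (13.39)). This leads immediately to the one gluon emission
amplitude (see figure 13.2)


$$-\mathrm{i} g_s \int \bar{\psi}_{\rm f}\,\boldsymbol{\lambda}/2\,\gamma^\mu \psi_{\rm i}\cdot\boldsymbol{A}_\mu\,\mathrm{d}^4x \tag{13.59}$$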


as already suggested in section 12.3.1: the SU(3) current of (12.133), but
this time in colour space, is ‘dotted’ with the gauge field. The Feynman rule
for figure 13.2 is therefore

$$-\mathrm{i} g_s \lambda_a/2\,\gamma^\mu. \tag{13.60}$$

FIGURE 13.2
Quark-gluon vertex.


The SU(3) field strength tensor can be calculated by evaluating the
commutator of two $D$’s of the form (13.53); the result (problem 13.7) is


$$F_a^{\mu\nu} = \partial^\mu A_a^\nu - \partial^\nu A_a^\mu - g_s f_{abc} A_b^\mu A_c^\nu \tag{13.61}$$


which is closely analogous to the SU(2) case (13.50) (the structure constants
of SU(2) are given by $\mathrm{i}\epsilon_{ijk}$, and of SU(3) by $\mathrm{i}f_{abc}$). Once again, the crucial
property of $F_a^{\mu\nu}$ is that, under local SU(3) transformations, it develops no
‘$\partial^\mu\eta_a$’ part, but transforms as a ‘proper’ octet:


$$F'^{\mu\nu}_a = F_a^{\mu\nu} - g_s f_{abc}\,\eta_b(x) F_c^{\mu\nu}. \tag{13.62}$$


This allows us to write down a locally SU(3)-invariant analogue of the Maxwell 
Lagrangian 


$$-\frac{1}{4} F_{a\mu\nu} F_a^{\mu\nu} \tag{13.63}$$


by dotting the two octets together. 
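
As an illustration of why ‘dotting the two octets together’ gives an invariant, the sketch below builds the SU(3) structure constants from the Gell-Mann matrices (taken here in their standard convention, which is an assumption about the conventions of chapter 12) and checks that the first-order change of $F_a F_a$ under (13.62) vanishes by the antisymmetry of $f_{abc}$. The octet components and parameters are random placeholder numbers, and the overall factor $g_s$ is omitted.

```python
import numpy as np

# Gell-Mann matrices lambda_1 ... lambda_8 (standard convention, assumed).
lam = np.zeros((8, 3, 3), dtype=complex)
lam[0] = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
lam[1] = [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]]
lam[2] = [[1, 0, 0], [0, -1, 0], [0, 0, 0]]
lam[3] = [[0, 0, 1], [0, 0, 0], [1, 0, 0]]
lam[4] = [[0, 0, -1j], [0, 0, 0], [1j, 0, 0]]
lam[5] = [[0, 0, 0], [0, 0, 1], [0, 1, 0]]
lam[6] = [[0, 0, 0], [0, 0, -1j], [0, 1j, 0]]
lam[7] = np.diag([1, 1, -2]) / np.sqrt(3)

T = lam / 2
# [T_a, T_b] = i f_abc T_c together with Tr(T_c T_d) = delta_cd / 2 gives
# f_abc = -2i Tr([T_a, T_b] T_c).
comm = np.einsum('aij,bjk->abik', T, T) - np.einsum('bij,ajk->abik', T, T)
f = (-2j * np.einsum('abik,cki->abc', comm, T)).real

# First-order change of F_a F_a under F_a -> F_a - g_s f_abc eta_b F_c
# (one Lorentz component; the factor g_s is dropped).
rng = np.random.default_rng(3)
F_oct = rng.normal(size=8)
eta = rng.normal(size=8)
first_order_change = -2 * np.einsum('a,abc,b,c->', F_oct, f, eta, F_oct)
print(f[0, 1, 2], f[3, 4, 7], first_order_change)
```

The same antisymmetry argument applies component by component in $\mu\nu$, which is all that (13.63) requires.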

It is now time to consider locally SU(2)- and SU(3)-invariant quantum 
field Lagrangians and, in particular, the resulting self-interactions among the 
gauge quanta. 




13.3 Local non-Abelian symmetries in Lagrangian 
quantum field theory 


13.3.1 Local SU(2) and SU(3) Lagrangians 


We consider here only the particular examples relevant to the strong and
electroweak interactions of quarks: namely, a (weak) SU(2) doublet of fermions
interacting with SU(2) gauge fields $W_i^\mu$, and a (strong) SU(3) triplet of fermions
interacting with the gauge fields $A_a^\mu$. We follow the same steps as in the U(1)
case of chapter 7, noting again that for quantum fields the sign of the
exponents in (13.2) and (13.52) is reversed, by convention; thus (12.89) is replaced
by its local version

$$\hat{q}' = \exp(-\mathrm{i} g\,\hat{\boldsymbol{\alpha}}(x)\cdot\boldsymbol{\tau}/2)\,\hat{q} \tag{13.64}$$

and (12.132) by

$$\hat{q}' = \exp(-\mathrm{i} g_s\,\hat{\boldsymbol{\alpha}}(x)\cdot\boldsymbol{\lambda}/2)\,\hat{q}. \tag{13.65}$$

Correspondingly, the $\boldsymbol{\epsilon}$ in (13.23) and the $\eta$’s in (13.56) become field operators,
with a reversal of sign.

FIGURE 13.3
SU(2) gauge-boson propagator.

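The globally SU(2)-invariant Lagrangian (12.87) becomes locally
SU(2)-invariant if we replace $\partial^\mu$ by $\hat{D}^\mu$ of (13.10), with $\hat{\boldsymbol{W}}^\mu$ now a quantum field:


$$\hat{\mathcal{L}}_{\rm D,\,local\ SU(2)} = \bar{\hat{q}}(\mathrm{i}\hat{\slashed{D}} - m)\hat{q} = \bar{\hat{q}}(\mathrm{i}\slashed{\partial} - m)\hat{q} - g\,\bar{\hat{q}}\gamma^\mu\boldsymbol{\tau}/2\,\hat{q}\cdot\hat{\boldsymbol{W}}_\mu \tag{13.66}$$


with an interaction of the form ‘symmetry current (12.109) dotted into the
gauge field’. To this we must add the SU(2) Yang-Mills term


$$\hat{\mathcal{L}}_{\rm Y-M,\,SU(2)} = -\frac{1}{4}\hat{\boldsymbol{F}}_{\mu\nu}\cdot\hat{\boldsymbol{F}}^{\mu\nu} \tag{13.67}$$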


to get the local SU(2) analogue of $\hat{\mathcal{L}}_{\rm QED}$. It is not possible to add a mass term
for the gauge fields of the form $\frac{1}{2}M^2\hat{\boldsymbol{W}}^\mu\cdot\hat{\boldsymbol{W}}_\mu$, since such a term would not be
invariant under the gauge transformations (13.26) or (13.34) of the W-fields.
Thus, just as in the U(1) (electromagnetic) case, the W-quanta of this theory
are massless. We presumably also need a gauge-fixing term for the gauge
fields, as in section 7.3.2, which we can take to be¹


$$\hat{\mathcal{L}}_{\rm gf} = -\frac{1}{2\xi}(\partial_\mu\hat{\boldsymbol{W}}^\mu\cdot\partial_\nu\hat{\boldsymbol{W}}^\nu). \tag{13.68}$$


The Feynman rule for the fermion-W vertex is then the same as already given
in (13.41), while the W-propagator is (figure 13.3)


$$\frac{\mathrm{i}[-g^{\mu\nu} + (1-\xi)k^\mu k^\nu/k^2]}{k^2 + \mathrm{i}\epsilon}\,\delta_{ij}. \tag{13.69}$$


¹We shall see in section 13.3.3 that in the non-Abelian case this gauge-fixing term does
not completely solve the problem of quantizing such gauge fields; however, it is adequate
for tree graphs.

Before proceeding to the SU(3) case, we must now emphasize three respects
in which our local SU(2) Lagrangian is not suitable (yet) for describing weak
interactions. First, weak interactions violate parity, in fact ‘maximally’, by
which is meant that only the ‘left-handed’ part $\hat{q}_{\rm L}$ of the fermion field enters
the interactions with the $\hat{\boldsymbol{W}}^\mu$ fields, where $\hat{q}_{\rm L} \equiv \left(\frac{1-\gamma_5}{2}\right)\hat{q}$; for this reason
the weak isospin group is called SU(2)$_{\rm L}$. Secondly, the physical W$^\pm$ are of
course not massless, and therefore cannot be described by propagators of the
form (13.69). And thirdly, the fermion mass term violates the ‘left-handed’
SU(2) gauge symmetry, as the discussion in section 12.3.2 shows. In this
case, however, the chiral symmetry which is broken by fermion masses in the
Lagrangian is a local, or gauge, symmetry (in section 12.3.2 the chiral flavour
symmetry was a global symmetry). If we want to preserve the chiral gauge
symmetry SU(2)$_{\rm L}$ (and it is necessary for renormalizability) then we shall
have to replace the simple fermion mass term in (13.66) by something else, as
will be explained in chapter 22.
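
The propagator (13.69) can be checked against the quadratic part of the Lagrangian. The sketch below assumes that, after partial integration, the free part of (13.67) plus (13.68) has the momentum-space kernel $-k^2 g_{\nu\rho} + (1 - 1/\xi)k_\nu k_\rho$ for each isospin component, and that the propagator is i times its inverse; the momentum and the values of $\xi$ are arbitrary, and the $\mathrm{i}\epsilon$ and $\delta_{ij}$ factors are left out.

```python
import numpy as np

g = np.diag([1.0, -1.0, -1.0, -1.0])      # numerically the same for g_{mu nu} and g^{mu nu}
k_up = np.array([1.3, 0.2, -0.5, 0.7])    # arbitrary off-shell momentum k^mu
k_dn = g @ k_up                           # k_mu
k2 = k_up @ k_dn

def propagator(xi):
    """D^{mu nu} of (13.69), without the i*epsilon and the delta_ij."""
    return 1j * (-g + (1 - xi) * np.outer(k_up, k_up) / k2) / k2

def quadratic_operator(xi):
    """Assumed kernel O_{nu rho} of the free, gauge-fixed W Lagrangian."""
    return -k2 * g + (1 - 1 / xi) * np.outer(k_dn, k_dn)

# D^{mu nu} O_{nu rho} should equal i * delta^mu_rho for every xi.
checks = [np.allclose(propagator(xi) @ quadratic_operator(xi), 1j * np.eye(4))
          for xi in (0.5, 1.0, 3.0)]
print(checks)
```

Without the gauge-fixing term (formally $\xi \to \infty$) the kernel annihilates $k^\rho$ and has no inverse, which is one way of seeing why (13.68) is needed at all.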

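The locally SU(3)$_{\rm c}$-invariant Lagrangian for one quark triplet (cf (12.137))


$$\hat{q}_f = \begin{pmatrix} \hat{f}_{\rm r} \\ \hat{f}_{\rm b} \\ \hat{f}_{\rm g} \end{pmatrix} \tag{13.70}$$

where ‘f’ stands for ‘flavour’, and ‘r, b, and g’ for ‘red, blue, and green’, is

$$\bar{\hat{q}}_f(\mathrm{i}\hat{\slashed{D}} - m_f)\hat{q}_f - \frac{1}{4}\hat{F}_{a\mu\nu}\hat{F}_a^{\mu\nu} - \frac{1}{2\xi}(\partial_\mu\hat{A}_a^\mu)(\partial_\nu\hat{A}_a^\nu) \tag{13.71}$$


where $\hat{D}^\mu$ is given by (13.53) with $\boldsymbol{A}^\mu$ replaced by $\hat{\boldsymbol{A}}^\mu$, and the footnote
before equation (13.68) also applies here. This leads to the interaction term
(cf (13.59))


$$-g_s\,\bar{\hat{q}}_f\gamma^\mu\boldsymbol{\lambda}/2\,\hat{q}_f\cdot\hat{\boldsymbol{A}}_\mu \tag{13.72}$$


and the Feynman rule (13.60) for figure 13.2. Once again, the gluon quanta
must be massless, and their propagator is the same as (13.69), with $\delta_{ij} \to
\delta_{ab}$ $(a, b = 1, 2, \ldots, 8)$. The different quark flavours are included by simply
repeating the first term of (13.71) for all flavours:


$$\sum_f \bar{\hat{q}}_f(\mathrm{i}\hat{\slashed{D}} - m_f)\hat{q}_f, \tag{13.73}$$


which incorporates the hypothesis that the SU(3)$_{\rm c}$-gauge interaction is
‘flavour-blind’, i.e. exactly the same for each flavour. Note that although the flavour
masses are different, the masses of different ‘coloured’ quarks of the same
flavour are the same ($m_{\rm u} \neq m_{\rm d}$, $m_{\rm u,r} = m_{\rm u,b} = m_{\rm u,g}$).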

The Lagrangians (13.66)-(13.68), and (13.71), though easily written down 
after all this preparation, are unfortunately not adequate for anything but 
tree graphs. We shall indicate why this is so in section 13.3.3. Before that, we 
want to discuss in more detail the nature of the gauge-field self-interactions 
contained in the Yang-Mills pieces. 
13.3.2 Gauge field self-interactions


We start by pointing out an interesting ambiguity in the prescription for
‘covariantizing’ wave equations which we have followed, namely ‘replace $\partial^\mu$
by $D^\mu$’. Suppose we wished to consider the electromagnetic interactions of
charged massless spin-1 particles, call them X’s, carrying charge $e$. The
standard wave equation for such free massless vector particles would be the same
as for $A^\mu$, namely


$$\Box X^\mu - \partial^\mu\partial^\nu X_\nu = 0. \tag{13.74}$$


To ‘covariantize’ this (i.e. introduce the electromagnetic coupling) we would
replace $\partial^\mu$ by $D^\mu = \partial^\mu + \mathrm{i}eA^\mu$ so as to obtain


$$D^2 X^\mu - D^\mu D^\nu X_\nu = 0. \tag{13.75}$$


But this procedure is not unique: if we had started from the perfectly
equivalent wave equation


$$\Box X^\mu - \partial^\nu\partial^\mu X_\nu = 0 \tag{13.76}$$


we would have arrived at

$$D^2 X^\mu - D^\nu D^\mu X_\nu = 0 \tag{13.77}$$

which is not the same as (13.75), since (cf (13.45))

$$[D^\mu, D^\nu] = \mathrm{i}eF^{\mu\nu}. \tag{13.78}$$


The simple prescription $\partial^\mu \to D^\mu$ has, in this case, failed to produce a
unique wave equation. We can allow for this ambiguity by introducing an
arbitrary parameter $\delta$ in the wave equation, which we write as


$$D^2 X^\mu - D^\nu D^\mu X_\nu + \mathrm{i}e\delta F^{\mu\nu} X_\nu = 0. \tag{13.79}$$


The $\delta$ term in (13.79) contributes to the magnetic moment coupling of the
X-particle to the electromagnetic field, and is called the ‘ambiguous magnetic
moment’. Just such an ambiguity would seem to arise in the case of the
charged weak interaction quanta W$^\pm$ (their masses do not affect this
argument). For the photon itself, of course, $e = 0$ and there is no such ambiguity.
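
It may help to see explicitly how the two ‘covariantized’ equations sit inside the one-parameter family (13.79). The short derivation below uses only the commutator (13.78); the identification of the particular values of $\delta$ is our own bookkeeping rather than a statement quoted from the text.

```latex
% Reorder the derivatives using [D^\nu, D^\mu] = ieF^{\nu\mu} = -ieF^{\mu\nu}:
\begin{align*}
D^\nu D^\mu X_\nu &= D^\mu D^\nu X_\nu + [D^\nu, D^\mu] X_\nu
                   = D^\mu D^\nu X_\nu - \mathrm{i}e F^{\mu\nu} X_\nu , \\
\text{so (13.77) reads}\quad
0 &= D^2 X^\mu - D^\mu D^\nu X_\nu + \mathrm{i}e F^{\mu\nu} X_\nu .
\end{align*}
% Comparing with (13.79):
%   \delta =  0  reproduces (13.77);
%   \delta = -1  reproduces (13.75), since the two F^{\mu\nu} terms then cancel.
```

On this reading the two starting points differ by one unit of $\delta$, and neither coincides with the value $\delta = 1$ that is singled out later by the SU(2) gauge symmetry.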

It is important to be clear that (13.79) is fully U(1) gauge-covariant, so that
$\delta$ cannot be fixed by further appeal to the local U(1) symmetry. Moreover, it
turns out that the theory for arbitrary $\delta$ is not renormalizable (though we shall
not show this here): thus the quantum electrodynamics of charged massless
vector bosons is in general non-renormalizable.

However, the theory is renormalizable if (to continue with the present
terminology) the photon, the X-particle, and its antiparticle the $\bar{\rm X}$ are the
members of an SU(2) gauge triplet (like the W’s), with gauge coupling
constant $e$. This is, indeed, very much how the photon and the W$^\pm$ are ‘unified’,
but there is a complication (as always!) in that case, having to do with the
necessity for finding room in the scheme for the neutral weak boson Z$^0$ as
well. We shall see how this works in chapter 19; meanwhile we continue with
this X-$\gamma$ model. We shall show that when the X-$\gamma$ interaction contained
in (13.79) is regarded as a 3-X vertex in a local SU(2) gauge theory, the
value of $\delta$ has to equal 1; for this value the theory is renormalizable. In this
interpretation, the $X^\mu$ wave function is identified with $\frac{1}{\sqrt{2}}(X_1^\mu + \mathrm{i}X_2^\mu)$ and
$\bar{X}^\mu$ with $\frac{1}{\sqrt{2}}(X_1^\mu - \mathrm{i}X_2^\mu)$ in terms of components of the SU(2) triplet $\boldsymbol{X}^\mu$,
while $A^\mu$ is identified with $X_3^\mu$.
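Consider then equation (13.79) written in the form²


$$\Box X^\mu - \partial^\nu\partial^\mu X_\nu = \hat{V}X^\mu \tag{13.80}$$

where

$$\begin{aligned}
\hat{V}X^\mu = -\mathrm{i}e\{&[\partial^\nu(A_\nu X^\mu) + A^\nu\partial_\nu X^\mu] \\
&- (1+\delta)[\partial^\nu(A^\mu X_\nu) + A^\nu\partial^\mu X_\nu] \\
&+ \delta[\partial^\mu(A^\nu X_\nu) + A^\mu\partial^\nu X_\nu]\},
\end{aligned} \tag{13.81}$$


and we have dropped terms of $O(e^2)$ which appear in the ‘$D^2$’ term; we shall
come back to them later. The terms inside the { } brackets have been written
in such a way that each [ ] bracket has the structure


$$\partial(AX) + A(\partial X) \tag{13.82}$$


which will be convenient for the following evaluation.
The lowest-order ($O(e)$) perturbation theory amplitude for ‘X → X’ under
the potential $\hat{V}$ is then


$$-\mathrm{i}\int X_\mu^*({\rm f})\,\hat{V}X^\mu({\rm i})\,\mathrm{d}^4x. \tag{13.83}$$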


²The sign chosen for $\hat{V}$ here apparently differs from that in the KG case (3.101), but
it does agree when allowance is made, in the amplitude (13.83), for the fact that the dot
product of the polarization vectors is negative (cf (7.87)).

FIGURE 13.4
Triple-X vertex.

Inserting (13.81) into (13.83) clearly gives something involving two
‘X’-wavefunctions and one ‘A’ one, i.e. a triple-X vertex (with $A^\mu \equiv X_3^\mu$), shown in
figure 13.4. To obtain the rule for this vertex from (13.83), consider the first
[ ] bracket in (13.81). It contributes


$$-\mathrm{i}(-\mathrm{i}e)\int X_\mu^*(2)\{\partial^\nu(X_{3\nu}(3)X^\mu(1)) + X_3^\nu(3)\partial_\nu X^\mu(1)\}\,\mathrm{d}^4x \tag{13.84}$$


where the (1), (2), (3) refer to the momenta as shown in figure 13.4, and for
reasons of symmetry are all taken to be ingoing; thus


$$X_3^\mu(3) = \epsilon_3^\mu \exp(-\mathrm{i}k_3\cdot x) \tag{13.85}$$


for example. The first term in (13.84) can be easily evaluated by a partial
integration to turn the $\partial^\nu$ onto the $X_\mu^*(2)$, while in the second term $\partial_\nu$ acts
straightforwardly on $X^\mu(1)$. Omitting the usual $(2\pi)^4\delta^4$ energy-momentum
conserving factor, we find (problem 13.8) that (13.84) leads to the amplitude


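$$\mathrm{i}e\,\epsilon_1\cdot\epsilon_2\,(k_1 - k_2)\cdot\epsilon_3. \tag{13.86}$$

In a similar way, the other terms in (13.83) give

$$-\mathrm{i}e\delta(\epsilon_1\cdot\epsilon_3\,\epsilon_2\cdot k_2 - \epsilon_2\cdot\epsilon_3\,\epsilon_1\cdot k_1) \tag{13.87}$$


and

$$+\mathrm{i}e(1+\delta)(\epsilon_2\cdot\epsilon_3\,\epsilon_1\cdot k_2 - \epsilon_3\cdot\epsilon_1\,\epsilon_2\cdot k_1). \tag{13.88}$$


Adding all the terms up and using the 4-momentum conservation condition

$$k_1 + k_2 + k_3 = 0 \tag{13.89}$$

we obtain the vertex

$$+\mathrm{i}e\{\epsilon_1\cdot\epsilon_2\,(k_1 - k_2)\cdot\epsilon_3 + \epsilon_2\cdot\epsilon_3\,(\delta k_2 - k_3)\cdot\epsilon_1 + \epsilon_3\cdot\epsilon_1\,(k_3 - \delta k_1)\cdot\epsilon_2\}. \tag{13.90}$$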
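
The step from (13.86)-(13.88) to (13.90) can be spot-checked numerically. The sketch below is illustrative only: it uses random real momenta obeying (13.89) and random polarization vectors, and it assumes in addition the transversality conditions $\epsilon_i\cdot k_i = 0$, which appear to be needed for the sum to collapse to (13.90) when $\delta \neq 1$. The function names and the values of $e$ and $\delta$ are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
metric = np.diag([1.0, -1.0, -1.0, -1.0])

def dot(a, b):
    """Minkowski product a.b with signature (+,-,-,-)."""
    return a @ metric @ b

def transverse(k):
    """Random real 4-vector eps with eps.k = 0 (normalisation is irrelevant here)."""
    v, u = rng.normal(size=4), rng.normal(size=4)
    return v - dot(v, k) / dot(u, k) * u

def vertex(delta, leg1, leg2, leg3, e=0.3):
    """The combined vertex (13.90); each leg is a (momentum, polarization) pair."""
    (k1, e1), (k2, e2), (k3, e3) = leg1, leg2, leg3
    return 1j * e * (dot(e1, e2) * dot(k1 - k2, e3)
                     + dot(e2, e3) * dot(delta * k2 - k3, e1)
                     + dot(e3, e1) * dot(k3 - delta * k1, e2))

def pieces(delta, leg1, leg2, leg3, e=0.3):
    """Sum of the separate contributions (13.86), (13.87) and (13.88)."""
    (k1, e1), (k2, e2), (k3, e3) = leg1, leg2, leg3
    t86 = 1j * e * dot(e1, e2) * dot(k1 - k2, e3)
    t87 = -1j * e * delta * (dot(e1, e3) * dot(e2, k2) - dot(e2, e3) * dot(e1, k1))
    t88 = 1j * e * (1 + delta) * (dot(e2, e3) * dot(e1, k2) - dot(e3, e1) * dot(e2, k1))
    return t86 + t87 + t88

k1, k2 = rng.normal(size=4), rng.normal(size=4)
k3 = -(k1 + k2)                      # all momenta ingoing, (13.89)
legs = [(k, transverse(k)) for k in (k1, k2, k3)]

delta = 0.37                         # arbitrary illustrative value
sum_matches = np.isclose(pieces(delta, *legs), vertex(delta, *legs))
cyclic_at_one = np.isclose(vertex(1.0, *legs),
                           vertex(1.0, legs[1], legs[2], legs[0]))
cyclic_generic = np.isclose(vertex(delta, *legs),
                            vertex(delta, legs[1], legs[2], legs[0]))
print(sum_matches, cyclic_at_one, cyclic_generic)
```

The last two comparisons illustrate one sense in which $\delta = 1$ is special: only for that value is (13.90) unchanged under a cyclic relabelling of the three legs.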


It is quite evident from (13.90) that the value $\delta = 1$ has a privileged role,
and we strongly suspect that this will be the value selected by the proposed
SU(2) gauge symmetry of this model. We shall check this in two ways: in the
first, we consider a ‘physical’ process involving the vertex (13.90), and show
how requiring it to be SU(2)-gauge invariant fixes $\delta$ to be 1; in the second, we
‘unpack’ the relevant vertex from the compact Yang-Mills Lagrangian
$-\frac{1}{4}\hat{\boldsymbol{X}}_{\mu\nu}\cdot\hat{\boldsymbol{X}}^{\mu\nu}$.

FIGURE 13.5
Tree graphs contributing to X + d → X + d.

The process we shall choose is X + d → X + d, where d is a fermion (which
we call a quark) transforming as the $T_3 = -\frac{1}{2}$ component of a doublet under
the SU(2) gauge group, its $T_3 = +\frac{1}{2}$ partner being the u. There are two
contributing Feynman graphs, shown in figure 13.5(a) and (b). Consider first
the amplitude for figure 13.5(a). We use the rule of figure 13.1, with the
$\tau$-matrix combination $\tau_+ = (\tau_1 + \mathrm{i}\tau_2)/\sqrt{2}$ corresponding to the absorption of
the positively charged X, and $\tau_- = (\tau_1 - \mathrm{i}\tau_2)/\sqrt{2}$ for the emission of the X.
Then figure 13.5(a) is


$$(-\mathrm{i}e)^2\,\bar{\psi}^{(-\frac{1}{2})}(p_2)\,\frac{\tau_-}{2}\,\slashed{\epsilon}_2\,\frac{\mathrm{i}}{\slashed{p}_1 + \slashed{k}_1 - m}\,\frac{\tau_+}{2}\,\slashed{\epsilon}_1\,\psi^{(-\frac{1}{2})}(p_1) \tag{13.91}$$


where

$$\psi^{(-\frac{1}{2})} = \begin{pmatrix} 0 \\ d \end{pmatrix} \tag{13.92}$$

and we have chosen real polarization vectors. Using the explicit forms (12.25)
for the $\tau$-matrices, (13.91) becomes


$$(-\mathrm{i}e)^2\,\bar{d}(p_2)\,\frac{1}{\sqrt{2}}\,\slashed{\epsilon}_2\,\frac{\mathrm{i}}{\slashed{p}_1 + \slashed{k}_1 - m}\,\frac{1}{\sqrt{2}}\,\slashed{\epsilon}_1\,d(p_1). \tag{13.93}$$

We must now discuss how to implement gauge invariance. In the QED case
of electron Compton scattering (section 8.6.2), the test of gauge invariance was
that the amplitude should vanish if any photon polarization vector $\epsilon^\mu(k)$ was
replaced by $k^\mu$ (see (8.165)). This requirement was derived from the fact that a
gauge transformation on the photon $A^\mu$ took the form $A^\mu \to A'^\mu = A^\mu - \partial^\mu\chi$,
so that, consistently with the Lorentz condition, $\epsilon^\mu$ could be replaced by
$\epsilon'^\mu = \epsilon^\mu + \beta k^\mu$ (cf (8.163)) without changing the physics. But the SU(2) analogue
of the U(1) gauge transformation is given by (13.26), for infinitesimal $\epsilon$’s, and
although there is indeed an analogous ‘$-\partial^\mu\epsilon$’ part, there is also an additional
part (with $g \to e$ in our case) expressing the fact that the X’s carry SU(2)
charge. However this extra part does involve the coupling $e$. Hence, if we were
to make the full change corresponding to (13.26) in a tree graph of order $e^2$,
the extra part would produce a term of order $e^3$. We shall take the view that
gauge invariance should hold at each order of perturbation theory separately;
thus we shall demand that the tree graphs for X-d scattering, for example,
should be invariant under $\epsilon^\mu \to k^\mu$ for any $\epsilon$.

The replacement $\epsilon_1 \to k_1$ in (13.93) produces the result (problem 13.9)


$$(-\mathrm{i}e)^2\,\frac{\mathrm{i}}{2}\,\bar{d}(p_2)\,\slashed{\epsilon}_2\,d(p_1) \tag{13.94}$$
FIGURE 13.6
Tree graphs contributing to $\gamma$ + X → $\gamma$ + X.


where we have used the Dirac equation for the quark spinors of mass $m$. The
term (13.94) is certainly not zero, but we must of course also include the
amplitude for figure 13.5(b). Using the vertex of (13.90) with suitable sign
changes of momenta, and the photon propagator of (7.119), and remembering
that d has $\tau_3 = -1$, the amplitude for figure 13.5(b) is


$$\mathrm{i}e[\epsilon_1\cdot\epsilon_2\,(k_1 + k_2)_\mu + \epsilon_{2\mu}\,\epsilon_1\cdot(-\delta k_2 - k_2 + k_1) + \epsilon_{1\mu}\,\epsilon_2\cdot(k_2 - k_1 - \delta k_1)]
\times \frac{-\mathrm{i}g^{\mu\nu}}{q^2} \times \left[-\mathrm{i}e\,\bar{d}(p_2)\left(-\frac{1}{2}\right)\gamma_\nu d(p_1)\right], \tag{13.95}$$


where $q^2 = (k_1 - k_2)^2 = -2k_1\cdot k_2$ using $k_1^2 = k_2^2 = 0$, and where the
$\xi$-dependent part of the $\gamma$-propagator vanishes since $\bar{d}(p_2)\,\slashed{q}\,d(p_1) = 0$. We now
leave it as an exercise (problem 13.10) to verify that, when $\epsilon_1 \to k_1$ in (13.95),
the resulting amplitude does exactly cancel the contribution (13.94), provided
that $\delta = 1$. Thus the X-X-$\gamma$ vertex is, assuming the SU(2) gauge symmetry,


$$\mathrm{i}e[\epsilon_1\cdot\epsilon_2\,(k_1 - k_2)\cdot\epsilon_3 + \epsilon_2\cdot\epsilon_3\,(k_2 - k_3)\cdot\epsilon_1 + \epsilon_3\cdot\epsilon_1\,(k_3 - k_1)\cdot\epsilon_2]. \tag{13.96}$$
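
For orientation, here is one way the cancellation of problem 13.10 can be organized. It is a sketch rather than the book's own solution, and it assumes the forms of (13.94) and (13.95) used above, together with $k_1^2 = 0$, $\epsilon_2\cdot k_2 = 0$ and $q = k_1 - k_2 = p_2 - p_1$.

```latex
% Square bracket of (13.95) after \epsilon_1 \to k_1, for general \delta:
\begin{align*}
&k_1\!\cdot\!\epsilon_2\,(k_1+k_2)_\mu
 - (1+\delta)\,k_1\!\cdot\!k_2\,\epsilon_{2\mu}
 - (1+\delta)\,k_1\!\cdot\!\epsilon_2\,k_{1\mu} \\
&\qquad = k_1\!\cdot\!\epsilon_2\,(k_2 - \delta k_1)_\mu
 + \tfrac{1}{2}(1+\delta)\,q^2\,\epsilon_{2\mu},
 \qquad q^2 = -2k_1\!\cdot\!k_2 .
\end{align*}
% For \delta = 1 the first term is proportional to q_\mu = (k_1 - k_2)_\mu and
% drops out, because \bar d(p_2)\,\slashed{q}\,d(p_1) = 0 by the Dirac equation.
% What remains is
\begin{align*}
\mathrm{i}e\,q^2\epsilon_{2\mu}\,\frac{-\mathrm{i}g^{\mu\nu}}{q^2}
 \Big[-\mathrm{i}e\,\bar d(p_2)\big(-\tfrac12\big)\gamma_\nu d(p_1)\Big]
 = \frac{\mathrm{i}e^2}{2}\,\bar d(p_2)\,\slashed{\epsilon}_2\,d(p_1)
 = -(-\mathrm{i}e)^2\,\frac{\mathrm{i}}{2}\,\bar d(p_2)\,\slashed{\epsilon}_2\,d(p_1),
\end{align*}
% which is minus the contribution (13.94).
```

For $\delta \neq 1$ the term in $(k_2 - \delta k_1)_\mu$ is no longer proportional to $q_\mu$ and the coefficient of $\epsilon_{2\mu}$ changes, so the two graphs no longer cancel.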

The verification of this non-Abelian gauge invariance to order $e^2$ is, of
course, not a proof that the entire theory of massless X quanta, $\gamma$’s and quark
isospinors will be gauge invariant if $\delta = 1$. Indeed, having obtained the
X-X-$\gamma$ vertex, we immediately have something new to check: we can see if
the lowest-order $\gamma$-X scattering amplitude is gauge invariant. The X-X-$\gamma$
vertex will generate the $O(e^2)$ graphs shown in figure 13.6, and the dedicated
reader may check that the sum of these amplitudes is not gauge invariant,
again in the (tree-graph) sense of not vanishing when any $\epsilon$ is replaced by the
corresponding $k$. But this is actually correct. In obtaining the X-X-$\gamma$
vertex we dropped an $O(e^2)$ term involving the three fields A, A and X, in
going from (13.81) to (13.90): this will generate an $O(e^2)$ $\gamma$-$\gamma$-X-X
interaction, figure 13.7, when used in lowest-order perturbation theory. One
can find the amplitude for figure 13.7 by the gauge invariance requirement
applied to figures 13.6 and 13.7, but it has to be admitted that this approach
is becoming laborious. It is, of course, far more efficient to deduce the vertices
from the compact Yang-Mills Lagrangian $-\frac{1}{4}\hat{\boldsymbol{X}}_{\mu\nu}\cdot\hat{\boldsymbol{X}}^{\mu\nu}$, which we shall now
do; nevertheless, some of the physical implications of those couplings, such as
we have discussed above, are worth exposing.

FIGURE 13.7
$\gamma$-$\gamma$-X-X vertex.

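The SU(2) Yang-Mills Lagrangian for the SU(2) triplet of gauge fields $\hat{\boldsymbol{X}}^\mu$
is


$$\hat{\mathcal{L}}_{2,{\rm YM}} = -\frac{1}{4}\hat{\boldsymbol{X}}_{\mu\nu}\cdot\hat{\boldsymbol{X}}^{\mu\nu} \tag{13.97}$$

where

$$\hat{\boldsymbol{X}}^{\mu\nu} = \partial^\mu\hat{\boldsymbol{X}}^\nu - \partial^\nu\hat{\boldsymbol{X}}^\mu - e\,\hat{\boldsymbol{X}}^\mu\times\hat{\boldsymbol{X}}^\nu. \tag{13.98}$$

$\hat{\mathcal{L}}_{2,{\rm YM}}$ can be unpacked a bit into

$$\begin{aligned}
&-\frac{1}{2}(\partial_\mu\hat{\boldsymbol{X}}_\nu - \partial_\nu\hat{\boldsymbol{X}}_\mu)\cdot\partial^\mu\hat{\boldsymbol{X}}^\nu
+ e(\hat{\boldsymbol{X}}_\mu\times\hat{\boldsymbol{X}}_\nu)\cdot\partial^\mu\hat{\boldsymbol{X}}^\nu \\
&-\frac{1}{4}e^2\left[(\hat{\boldsymbol{X}}^\mu\cdot\hat{\boldsymbol{X}}_\mu)^2 - (\hat{\boldsymbol{X}}^\mu\cdot\hat{\boldsymbol{X}}^\nu)(\hat{\boldsymbol{X}}_\mu\cdot\hat{\boldsymbol{X}}_\nu)\right].
\end{aligned} \tag{13.99}$$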
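
The unpacking of (13.97) into (13.99) is pure algebra, so it can be spot-checked with classical (commuting) numbers at a single point. In the sketch below the field values and their derivatives are independent random arrays, the coupling is an arbitrary number, and all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
e = 0.3
metric = np.diag([1.0, -1.0, -1.0, -1.0])

X_dn = rng.normal(size=(4, 3))        # X_mu: an isospin triplet for each mu
dX_dn = rng.normal(size=(4, 4, 3))    # dX_dn[mu, nu] stands for d_mu X_nu

# Raise Lorentz indices with the metric.
X_up = np.einsum('mn,ni->mi', metric, X_dn)
dX_up = np.einsum('ma,nb,abi->mni', metric, metric, dX_dn)

def strength(X, dX):
    """X_{mu nu} (or X^{mu nu}) as defined in (13.98)."""
    return dX - dX.transpose(1, 0, 2) - e * np.cross(X[:, None, :], X[None, :, :])

# Compact form (13.97).
compact = -0.25 * np.sum(strength(X_dn, dX_dn) * strength(X_up, dX_up))

# Unpacked form (13.99), term by term.
curl_dn = dX_dn - dX_dn.transpose(1, 0, 2)
cross_dn = np.cross(X_dn[:, None, :], X_dn[None, :, :])
quadratic = -0.5 * np.sum(curl_dn * dX_up)
cubic = e * np.sum(cross_dn * dX_up)
XX_mixed = np.einsum('mi,mi->', X_up, X_dn)       # X^mu . X_mu
XX_up = np.einsum('mi,ni->mn', X_up, X_up)        # X^mu . X^nu
XX_dn = np.einsum('mi,ni->mn', X_dn, X_dn)        # X_mu . X_nu
quartic = -0.25 * e**2 * (XX_mixed**2 - np.sum(XX_up * XX_dn))

print(compact, quadratic + cubic + quartic)
```

The quartic piece relies on the identity $(\boldsymbol{a}\times\boldsymbol{b})\cdot(\boldsymbol{c}\times\boldsymbol{d}) = (\boldsymbol{a}\cdot\boldsymbol{c})(\boldsymbol{b}\cdot\boldsymbol{d}) - (\boldsymbol{a}\cdot\boldsymbol{d})(\boldsymbol{b}\cdot\boldsymbol{c})$ in isospin space; operator ordering, which matters for the quantum fields, is ignored in this classical check.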


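The X-X-$\gamma$ vertex is in the ‘$e$’ term, the X-X-$\gamma$-$\gamma$ one in the ‘$e^2$’
term. We give the form of the latter using SU(2) ‘$i, j, k$’ labels, as shown in
figure 13.8:


$$\begin{aligned}
-\mathrm{i}e^2[&\epsilon_{ij\ell}\epsilon_{mn\ell}(\epsilon_1\cdot\epsilon_3\,\epsilon_2\cdot\epsilon_4 - \epsilon_1\cdot\epsilon_4\,\epsilon_2\cdot\epsilon_3) \\
&+ \epsilon_{in\ell}\epsilon_{jm\ell}(\epsilon_1\cdot\epsilon_2\,\epsilon_3\cdot\epsilon_4 - \epsilon_1\cdot\epsilon_3\,\epsilon_2\cdot\epsilon_4) \\
&+ \epsilon_{im\ell}\epsilon_{nj\ell}(\epsilon_1\cdot\epsilon_4\,\epsilon_2\cdot\epsilon_3 - \epsilon_1\cdot\epsilon_2\,\epsilon_3\cdot\epsilon_4)].
\end{aligned} \tag{13.100}$$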


The reason for the collection of terms seen in (13.96) and (13.100) can be
understood as follows. Consider the 3-X vertex


$$\langle k_2, \epsilon_2, j;\ k_3, \epsilon_3, k\ |\ e(\hat{\boldsymbol{X}}_\mu\times\hat{\boldsymbol{X}}_\nu)\cdot\partial^\mu\hat{\boldsymbol{X}}^\nu\ |\ k_1, \epsilon_1, i\rangle \tag{13.101}$$

for example. When each $\hat{\boldsymbol{X}}$ is expressed as a mode expansion, and the initial
and final states are also written in terms of appropriate $\hat{a}$’s and $\hat{a}^\dagger$’s, the
amplitude will be a vacuum expectation value (vev) of six $\hat{a}$’s and $\hat{a}^\dagger$’s; the
different terms in (13.96) arise from the different ways of getting a non-zero
value for this vev, by manipulations similar to those in section 6.3.

FIGURE 13.8
4-X vertex.

We end this chapter by presenting an introduction to the problem of quan- 
tizing non-Abelian gauge field theories. Our aim will be, first, to indicate 
where the approach followed for the Abelian gauge field $\hat{A}^\mu$ in section 7.3.2
fails; and then to show how the assumption (nevertheless) that the Feyn- 
man rules we have established for tree graphs work for loops as well, leads 
to violations of unitarity. This calculation will indicate a very curious way of 
remedying the situation ‘by hand’, through the introduction of ghost particles, 
only present in loops. 


13.3.3 Quantizing non-Abelian gauge fields 


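We consider for definiteness the SU(2) gauge theory with massless gauge fields
$\hat{\boldsymbol{W}}^\mu(x)$, which we shall call gluons, by a slight abuse of language. We try to
carry through for the Yang-Mills Lagrangian


$$\hat{\mathcal{L}}_2 = -\frac{1}{4}\hat{\boldsymbol{F}}_{\mu\nu}\cdot\hat{\boldsymbol{F}}^{\mu\nu}, \tag{13.102}$$

where

$$\hat{\boldsymbol{F}}_{\mu\nu} = \partial_\mu\hat{\boldsymbol{W}}_\nu - \partial_\nu\hat{\boldsymbol{W}}_\mu - g\,\hat{\boldsymbol{W}}_\mu\times\hat{\boldsymbol{W}}_\nu, \tag{13.103}$$


the same steps we followed for the Maxwell one in section 7.3.2.
We begin by re-formulating the prescription arrived at in (7.119), which
we reproduce again here for convenience:


$$\hat{\mathcal{L}}_\xi = -\frac{1}{4}\hat{F}_{\mu\nu}\hat{F}^{\mu\nu} - \frac{1}{2\xi}(\partial_\mu\hat{A}^\mu)^2. \tag{13.104}$$


$\hat{\mathcal{L}}_\xi$ leads to the equation of motion


$$\Box\hat{A}^\mu - \partial^\mu\partial_\nu\hat{A}^\nu + \frac{1}{\xi}\partial^\mu\partial_\nu\hat{A}^\nu = 0. \tag{13.105}$$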
This has the drawback that the limit $\xi \to 0$ appears to be singular (though the
propagator (7.122) is well-behaved as $\xi \to 0$). To avoid this unpleasantness,
consider the Lagrangian (Lautrup 1967)


$$\hat{\mathcal{L}}_{\xi B} = -\frac{1}{4}\hat{F}_{\mu\nu}\hat{F}^{\mu\nu} + \hat{B}\,\partial_\mu\hat{A}^\mu + \frac{1}{2}\xi\hat{B}^2 \tag{13.106}$$


where $\hat{B}$ is a scalar field. We may think of the ‘$\hat{B}\,\partial\cdot\hat{A}$’ term as a field theory
analogue of the procedure followed in classical Lagrangian mechanics, whereby
a constraint (in this case the gauge-fixing one $\partial\cdot\hat{A} = 0$) is brought into the
Lagrangian with a ‘Lagrange multiplier’ (here the auxiliary field $\hat{B}$). The
momentum conjugate to $\hat{A}^0$ is now


$$\hat{\pi}^0 = \hat{B} \tag{13.107}$$


while the Euler-Lagrange equations for $\hat{A}^\nu$ read


$$\Box\hat{A}^\nu - \partial^\nu\partial_\mu\hat{A}^\mu = \partial^\nu\hat{B}, \tag{13.108}$$


and for $\hat{B}$ yield

$$\partial_\mu\hat{A}^\mu + \xi\hat{B} = 0. \tag{13.109}$$


Eliminating $\hat{B}$ from (13.106) by means of (13.109) we recover (13.104). Taking
$\partial_\nu$ of (13.108) we learn that $\Box\hat{B} = 0$, so that $\hat{B}$ is a free massless field.
Applying $\Box$ to (13.109) then shows that $\Box\,\partial_\mu\hat{A}^\mu = 0$, so that $\partial_\mu\hat{A}^\mu$ is also a
free massless field.
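
The three statements in the last paragraph each follow in a line or two. The steps below are a sketch written out for convenience, using only (13.106), (13.108) and (13.109).

```latex
% (i) Eliminating \hat{B}: from (13.109), \hat{B} = -\frac{1}{\xi}\,\partial_\mu\hat{A}^\mu, so
\begin{align*}
\hat{B}\,\partial_\mu\hat{A}^\mu + \tfrac{1}{2}\xi\hat{B}^2
 = -\frac{1}{\xi}(\partial_\mu\hat{A}^\mu)^2 + \frac{1}{2\xi}(\partial_\mu\hat{A}^\mu)^2
 = -\frac{1}{2\xi}(\partial_\mu\hat{A}^\mu)^2 ,
\end{align*}
% which is the gauge-fixing term of (13.104).
% (ii) Acting with \partial_\nu on (13.108):
\begin{align*}
\Box\,\partial_\nu\hat{A}^\nu - \Box\,\partial_\mu\hat{A}^\mu = 0 = \Box\hat{B}.
\end{align*}
% (iii) Acting with \Box on (13.109):
\begin{align*}
\Box\,\partial_\mu\hat{A}^\mu = -\xi\,\Box\hat{B} = 0 .
\end{align*}
```

Step (ii) is the one that fails in the non-Abelian case, where the ordinary divergence is replaced by a covariant one.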

In this formulation, the appropriate subsidiary condition for getting rid of
the unphysical (non-transverse) degrees of freedom is (cf (7.111))


$$\hat{B}^{(+)}(x)\,|\Psi\rangle = 0. \tag{13.110}$$


Kugo and Ojima (1979) have shown that (13.110) provides a satisfactory
definition of the Hilbert space of states. In addition to this it is also essential to
prove that all physical results are independent of the gauge parameter $\xi$.
We now try to generalize the foregoing in a straightforward way to (13.102).
The obvious analogue of (13.106) would be to consider

$$\hat{\mathcal{L}}_{2,\xi B} = -\frac{1}{4}\hat{\boldsymbol{F}}_{\mu\nu}\cdot\hat{\boldsymbol{F}}^{\mu\nu} + \hat{\boldsymbol{B}}\cdot(\partial_\mu\hat{\boldsymbol{W}}^\mu) + \frac{1}{2}\xi\,\hat{\boldsymbol{B}}\cdot\hat{\boldsymbol{B}} \tag{13.111}$$

where $\hat{\boldsymbol{B}}$ is an SU(2) triplet of scalar fields. Equation (13.111) gives (cf
(13.108))

$$(\hat{D}^\nu)_{ij}\hat{F}_{j\mu\nu} + \partial_\mu\hat{B}_i = 0 \tag{13.112}$$


where the covariant derivative is now the one appropriate to the SU(2) triplet
$\hat{\boldsymbol{F}}_{\mu\nu}$ (see (13.44) with $t = 1$, and (12.48)), and $i, j$ are the SU(2) labels.
Similarly, (13.109) becomes


$$\partial_\mu\hat{\boldsymbol{W}}^\mu + \xi\hat{\boldsymbol{B}} = 0. \tag{13.113}$$
It is possible to verify that

$$(\hat{D}^\mu)_{ki}(\hat{D}^\nu)_{ij}\hat{F}_{j\mu\nu} = 0 \tag{13.114}$$

where $i, j, k$ are the SU(2) matrix indices, which implies that

$$(\hat{D}^\mu)_{ki}\,\partial_\mu\hat{B}_i = 0. \tag{13.115}$$


This is the crucial result: it implies that the auxiliary field $\hat{\boldsymbol{B}}$ is not a free
field in this non-Abelian case, and so neither (from (13.113)) is $\partial_\mu\hat{\boldsymbol{W}}^\mu$. In
consequence, the obvious generalizations of (7.108) or (13.110) cannot be used
to define the physical (transverse) states. The reason is that a condition like
(13.110) must hold for all times, and only if the field is free is its time variation
known (and essentially trivial).

Let us press ahead nevertheless, and assume that the rules we have derived 
so far are the correct Feynman rules for this gauge theory. We will see that 
this leads to physically unacceptable consequences, namely to the violation of 
unitarity. 

In fact, this is a problem which threatens all gauge theories if the gauge 
field is treated covariantly, i.e. as a 4-vector. As we saw in section 7.3.2, this 
introduces unphysical degrees of freedom which must somehow be eliminated 
from the theory, or at least prevented from affecting physical processes. In 
QED we do this by imposing the condition (7.111), or (13.110), but as we 
have seen the analogous conditions will not work in the non-Abelian case, and 
so unphysical states may make their presence felt, for example in the ‘sum 
over intermediate states’ which arises in the unitarity relation. This relation 
determines the imaginary part of an amplitude via an equation of the form 
(cf (11.65)) 


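$$2\,{\rm Im}\,\langle f\,|\,\mathcal{M}\,|\,i\rangle = \int\sum_n \langle f\,|\,\mathcal{M}\,|\,n\rangle\langle n\,|\,\mathcal{M}^\dagger\,|\,i\rangle\,\mathrm{d}\rho_n \tag{13.116}$$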


where $\langle f\,|\,\mathcal{M}\,|\,i\rangle$ is the (Feynman) amplitude for the process $i \to f$, and
the sum is over a complete set of physical intermediate states $|\,n\rangle$, which
can enter at the given energy; $\mathrm{d}\rho_n$ represents the phase space element for
the general intermediate state $|\,n\rangle$. Consider now the possibility of gauge
quanta appearing in the states $|\,n\rangle$. Since unitarity deals only with physical
states, such quanta can have only the two degrees of freedom (polarizations)
allowed for a physical massless gauge field (cf section 7.3.1). Now part of the
power of the ‘Feynman rules’ approach to perturbation theory is that it is
manifestly covariant. But there is no completely covariant way of selecting
out just the two physical components of a massless polarization vector $\epsilon_\mu$,
from the four originally introduced precisely for reasons of covariance. In
fact, when gauge quanta appear as virtual particles in intermediate states in
Feynman graphs, they will not be restricted to having only two polarization
states (as we shall see explicitly in a moment). Hence there is a real chance


FIGURE 13.9
Two-gluon intermediate state in the unitarity relation for the amplitude for
$q\bar{q} \to q\bar{q}$.


that when the imaginary part of such graphs is calculated, a contribution from 
the unphysical polarization states will be found, which has no counterpart at 
all in the physical unitarity relation, so that unitarity will not be satisfied. 
Since unitarity is an expression of conservation of probability, its violation is 
a serious disease indeed. 

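Consider, for example, the process $q\bar{q} \to q\bar{q}$ (where the ‘quarks’ are an
SU(2) doublet), whose imaginary part has a contribution from a state
containing two gluons (figure 13.9):


$$2\,{\rm Im}\,\langle q\bar{q}\,|\,\mathcal{M}\,|\,q\bar{q}\rangle = \int\sum \langle q\bar{q}\,|\,\mathcal{M}\,|\,gg\rangle\langle gg\,|\,\mathcal{M}^\dagger\,|\,q\bar{q}\rangle\,\mathrm{d}\rho_2 \tag{13.117}$$


where $\mathrm{d}\rho_2$ is the 2-body phase space for the g-g state. The 2-gluon amplitudes
in (13.117) must have the form


$$\mathcal{M}_{\mu_1\nu_1}\,\epsilon_1^{\mu_1}(k_1, \lambda_1)\,\epsilon_2^{\nu_1}(k_2, \lambda_2) \tag{13.118}$$


where $\epsilon^\mu(k, \lambda)$ is the polarization vector for the gluon with polarization $\lambda$ and
4-momentum $k$. The sum in (13.117) is then to be performed over $\lambda_1 = 1, 2$
and $\lambda_2 = 1, 2$ which are the physical polarization states (cf section 7.3.1).
Thus (13.117) becomes


$$2\,{\rm Im}\,\mathcal{M}_{q\bar{q}\to q\bar{q}} = \int\sum_{\lambda_1=1,2;\ \lambda_2=1,2} \mathcal{M}_{\mu_1\nu_1}\,\epsilon_1^{\mu_1}(k_1, \lambda_1)\,\epsilon_2^{\nu_1}(k_2, \lambda_2)
\times \mathcal{M}^*_{\mu_2\nu_2}\,\epsilon_1^{\mu_2}(k_1, \lambda_1)\,\epsilon_2^{\nu_2}(k_2, \lambda_2)\,\mathrm{d}\rho_2. \tag{13.119}$$


For later convenience we are using real polarization vectors as in (7.81) and
(7.82): $\epsilon(k_i, \lambda_i = +1) = (0, 1, 0, 0)$, $\epsilon(k_i, \lambda_i = -1) = (0, 0, 1, 0)$; and of course
$k_1^2 = k_2^2 = 0$.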

FIGURE 13.10
Some $O(g^4)$ contributions to $q\bar{q} \to q\bar{q}$.

We now wish to find out whether or not a result of the form (13.119)
will hold when the $\mathcal{M}$’s represent some suitable Feynman graphs. We first
note that we want the unitarity relation (13.119) to be satisfied order by
order in perturbation theory: that is to say, when the $\mathcal{M}$’s on both sides are
expanded in powers of the coupling strengths (as in the usual Feynman graph
expansion), the coefficients of corresponding powers on each side should be
equal. Since each emission or absorption of a gluon produces one power of the
SU(2) coupling $g$, the right-hand side of (13.119) involves at least the power
$g^4$. Thus the lowest-order process in which (13.119) may be tested is for the
fourth-order amplitude $\mathcal{M}^{(4)}_{q\bar{q}\to q\bar{q}}$. There are quite a number of contributions
to $\mathcal{M}^{(4)}_{q\bar{q}\to q\bar{q}}$, some of which are shown in figure 13.10; all contain a loop. On
the right-hand side of (13.119), each $\mathcal{M}$ involves two polarization vectors, and
so each must represent the $O(g^2)$ contribution to $q\bar{q} \to gg$, which we call $\mathcal{M}^{(2)}_{\mu\nu}$;
thus both sides are consistently of order $g^4$. There are three contributions to
$\mathcal{M}^{(2)}_{\mu\nu}$ shown in figure 13.11; when these are placed in (13.119), contributions
to the imaginary part of $\mathcal{M}^{(4)}_{q\bar{q}\to q\bar{q}}$ are generated, which should agree with the
imaginary part of the total $O(g^4)$ loop-graph contribution. Let us see if this
works out. We choose to work in the gauge $\xi = 1$, so that the gluon propagator
takes the familiar form $-\mathrm{i}g^{\mu\nu}\delta_{ij}/k^2$. According to the rules for propagators
and vertices already given, each of the loop amplitudes $\mathcal{M}^{(4)}_{q\bar{q}\to q\bar{q}}$ (e.g. those
of figure 13.10) will be proportional to the product of the propagators for the
quarks and the gluons, together with appropriate ‘$\gamma$’ and ‘$\tau$’ vertex factors,
the whole being integrated over the loop momentum. The extraction of the
imaginary part of a Feynman diagram is a technical matter, having to do with
careful consideration of the ‘$\mathrm{i}\epsilon$’ in the propagators. Rules for doing this exist
(Eden et al. 1966, section 2.9), and in the present case the result is that, to
compute the imaginary part of the amplitudes of figure 13.10, one replaces
each gluon propagator of momentum $k$ by


$$\pi(-g^{\mu\nu})\,\delta(k^2)\,\theta(k_0)\,\delta_{ij}. \tag{13.120}$$


That is, the propagator is replaced by a condition stating that, in evaluating
the imaginary part of the diagram, the gluon’s mass is constrained to have
the physical (free-field) value of zero, instead of varying freely as the loop
momentum varies, and its energy is positive. These conditions (one for each
gluon) have the effect of converting the loop integral into a standard two-body
phase space integral for the gg intermediate state, so that eventually


$$2\,{\rm Im}\,\mathcal{M}^{(4)}_{q\bar{q}\to q\bar{q}} = \int \mathcal{M}^{(2)}_{\mu_1\nu_1}(-g^{\mu_1\mu_2})\,\mathcal{M}^{(2)*}_{\mu_2\nu_2}(-g^{\nu_1\nu_2})\,\mathrm{d}\rho_2 \tag{13.121}$$

where $\mathcal{M}^{(2)}_{\mu_1\nu_1}$ is the sum of the three $O(g^2)$ tree graphs shown in figure 13.11,
with all external legs satisfying the ‘mass-shell’ conditions.

FIGURE 13.11
$O(g^2)$ contributions to $q\bar{q} \to gg$.

So, the imaginary part of the loop contribution to $\mathcal{M}^{(4)}_{q\bar{q}\to q\bar{q}}$ does seem to
have the form (13.116) as required by unitarity, with $|\,n\rangle$ the gg intermediate
state as in (13.119). But there is one essential difference between (13.121) and
(13.119): the place of the factor $-g^{\mu\nu}$ in (13.121) is taken in (13.119) by the
gluon polarization sum


$$P^{\mu\nu}(k) \equiv \sum_{\lambda=1,2} \epsilon^\mu(k, \lambda)\,\epsilon^\nu(k, \lambda) \tag{13.122}$$


for $k = k_1, k_2$ and $\lambda = \lambda_1, \lambda_2$ respectively. Thus we have to investigate whether
this difference matters.

To proceed further, it is helpful to have an explicit expression for $P^{\mu\nu}$. We
might think of calculating the necessary sum over $\lambda$ by brute force, using two
$\epsilon$’s specified by the conditions (cf (7.87))


$$\epsilon^\mu(k, \lambda)\,\epsilon_\mu(k, \lambda') = -\delta_{\lambda\lambda'}, \qquad \epsilon\cdot k = 0. \tag{13.123}$$


The trouble is that conditions (13.123) do not fix the $\epsilon$’s uniquely if $k^2 =
0$. (Note the $\delta(k^2)$ in (13.120).) Indeed, it is precisely the fact that any
given $\epsilon_\mu$ satisfying (13.123) can be replaced by $\epsilon_\mu + \lambda k_\mu$ that both reduces
the degrees of freedom to two (as we saw in section 7.3.1), and evinces the
essential arbitrariness in the $\epsilon_\mu$ specified only by (13.123). In order to calculate
(13.122), we need to put another condition on $\epsilon_\mu$, so as to fix it uniquely. A
standard choice (see e.g. Taylor 1976, pp 14-15) is to supplement (13.123)
with the further condition

$$t\cdot\epsilon = 0 \tag{13.124}$$
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where t is some 4-vector. This certainly fixes e,,, and enables us to calculate 
(13.122), but of course now two further difficulties have appeared: namely, the 
physical results seem to depend on t,,; and have we not lost Lorentz covariance, 
because the theory involves a special 4-vector t,,? 

Setting these questions aside for the moment, we can calculate (13.122) 
using the conditions (13.123) and (13.124), finding (problem 13.11) 


$$P_{\mu\nu} = -g_{\mu\nu} - [t^2 k_\mu k_\nu - k\cdot t\,(k_\mu t_\nu + k_\nu t_\mu)]/(k\cdot t)^2. \tag{13.125}$$


But only the first term on the right-hand side of (13.125) is to be seen in 
(13.121). A crucial quantity is clearly 


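$$U_{\mu\nu}(k,t) \equiv -g_{\mu\nu} - P_{\mu\nu} = [t^2 k_\mu k_\nu - k\cdot t\,(k_\mu t_\nu + k_\nu t_\mu)]/(k\cdot t)^2. \tag{13.126}$$

We note that whereas

$$k^\mu P_{\mu\nu} = k^\nu P_{\mu\nu} = 0 \tag{13.127}$$

(from the condition $k\cdot\epsilon = 0$), the same is not true of $k^\mu U_{\mu\nu}$; in fact,

$$k^\mu U_{\mu\nu} = -k_\nu \tag{13.128}$$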


where we have used $k^2 = 0$. It follows that $U_{\mu\nu}$ may be regarded as including
polarization states for which $\epsilon\cdot k \neq 0$. In physical terms, therefore, a gluon
appearing internally in a Feynman graph has to be regarded as existing in more
than just the two polarization states available to an external gluon (cf section
7.3.1). $U_{\mu\nu}$ characterizes the contribution of these unphysical polarization
states.
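
The transversality properties (13.127) and (13.128) can be checked numerically. The following is a minimal sketch in plain Python, assuming the metric signature (+,−,−,−) and the explicit form (13.125); the helper names (`dot`, `lower`, `metric`, `P`, `U`, `contract`) and the sample vectors `k` and `t` are illustrative choices, not taken from the text.

```python
from fractions import Fraction as F

G = [1, -1, -1, -1]  # diagonal of g_{mu nu}; signature (+,-,-,-) is assumed


def dot(a, b):
    """Minkowski product a.b of two contravariant 4-vectors."""
    return sum(G[m] * a[m] * b[m] for m in range(4))


def lower(a):
    """Covariant components a_mu of a contravariant 4-vector a^mu."""
    return [G[m] * a[m] for m in range(4)]


def metric(m, n):
    return G[m] if m == n else 0


def P(k, t):
    """Polarization sum P_{mu nu} in the form (13.125)."""
    kl, tl, kt, tt = lower(k), lower(t), dot(k, t), dot(t, t)
    return [[-metric(m, n)
             - (tt * kl[m] * kl[n] - kt * (kl[m] * tl[n] + kl[n] * tl[m])) / kt**2
             for n in range(4)] for m in range(4)]


def U(k, t):
    """Unphysical part U_{mu nu} = -g_{mu nu} - P_{mu nu}, as in (13.126)."""
    p = P(k, t)
    return [[-metric(m, n) - p[m][n] for n in range(4)] for m in range(4)]


def contract(v, M):
    """v^mu M_{mu nu}, returned as a list indexed by nu."""
    return [sum(v[m] * M[m][n] for m in range(4)) for n in range(4)]


k = [F(5), F(3), F(0), F(4)]    # light-like: 25 - 9 - 0 - 16 = 0
t = [F(2), F(1), F(-1), F(3)]   # arbitrary 4-vector with k.t != 0

print(contract(k, P(k, t)))             # every component vanishes, cf (13.127)
print(contract(k, U(k, t)), lower(k))   # first list is minus the second, cf (13.128)
```

Exact rational arithmetic is used so that the zeros are exact rather than floating-point approximations. The same check with `t` in place of `k` in the first contraction confirms that $t^\mu P_{\mu\nu} = 0$, as expected from (13.124).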
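The discrepancy between (13.121) and (13.119) is then

$$2\,\mathrm{Im}\,\mathcal{M}^{(4)}_{q\bar{q}\to q\bar{q}} = \int \mathcal{M}^{(2)}_{\mu_1\nu_1}\,[U^{\mu_1\mu_2}(k_1,t_1)]\,\mathcal{M}^{(2)}_{\mu_2\nu_2}\,[U^{\nu_1\nu_2}(k_2,t_2)]\,\mathrm{d}\rho_2, \tag{13.129}$$

together with similar terms involving one $P$ and one $U$. It follows that these
unwanted contributions will, in fact, vanish if

$$k_1^{\mu_1}\mathcal{M}^{(2)}_{\mu_1\nu_1} = 0, \tag{13.130}$$

and similarly for $k_2$. This will also ensure that amplitudes are independent of
$t_\mu$.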

Condition (13.130) is apparently the same as the U(1) gauge invariance 
requirement of (8.165), already recalled in the previous section. As discussed 
there, it can be interpreted here also as expressing gauge invariance in the 
non-Abelian case, working to this given order in perturbation theory. Indeed, 
the diagrams of figure 13.11 are essentially ‘crossed’ versions of those in figure 
13.5. However, there is one crucial difference here. In figure 13.5, both the
$X$'s were physical, their polarizations satisfying the condition $\epsilon\cdot k = 0$. In
figure 13.11, by contrast, neither of the gluons, in the discrepant contribution




(13.129), satisfies $\epsilon\cdot k = 0$ (see the sentence following (13.128)). Thus the
crucial point is that (13.130) must be true for each gluon, even when the other
gluon has $\epsilon\cdot k \neq 0$. And, in fact, we shall now see that whereas the (crossed)
version of (13.130) did hold for our $dX \to dX$ amplitudes of section 13.3.2,
(13.130) fails for states with $\epsilon\cdot k \neq 0$.

The three graphs of figure 13.11 together yield 


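$$\begin{aligned}
\mathcal{M}^{(2)}_{\mu_1\nu_1}\epsilon_1^{\mu_1}(k_1,\lambda_1)\,\epsilon_2^{\nu_1}(k_2,\lambda_2) ={}& g^2\,\bar{v}(p_2)\,\not\epsilon_2\,a_{2j}\frac{\tau_j}{2}\,\frac{1}{\not p_1 - \not k_1 - m}\,a_{1i}\frac{\tau_i}{2}\,\not\epsilon_1\,u(p_1) \\
&+ g^2\,\bar{v}(p_2)\,\not\epsilon_1\,a_{1i}\frac{\tau_i}{2}\,\frac{1}{\not p_1 - \not k_2 - m}\,a_{2j}\frac{\tau_j}{2}\,\not\epsilon_2\,u(p_1) \\
&+ (-\mathrm{i})g^2\epsilon_{kij}\,[(p_1+p_2+k_1)^{\nu_1}g^{\mu_1\rho} + (-k_2-p_1-p_2)^{\mu_1}g^{\rho\nu_1} \\
&\qquad + (-k_1+k_2)^{\rho}g^{\mu_1\nu_1}]\,\epsilon_{1\mu_1}a_{1i}\,\epsilon_{2\nu_1}a_{2j}\,\frac{-1}{(p_1+p_2)^2}\,\bar{v}(p_2)\frac{\tau_k}{2}\gamma_\rho u(p_1)
\end{aligned} \tag{13.131}$$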


where we have written the gluon polarization vectors as a product of a Lorentz
4-vector $\epsilon_\mu$ and an ‘SU(2) polarization vector’ $a_i$ to specify the triplet state
label. Now replace $\epsilon_1$, say, by $k_1$. Using the Dirac equation for $u(p_1)$ and
$\bar{v}(p_2)$ the first two terms reduce to (cf (13.94))


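$$g^2\,\bar{v}(p_2)\,\not\epsilon_2\,[\tau_i/2, \tau_j/2]\,u(p_1)\,a_{1i}a_{2j} = \mathrm{i}g^2\,\bar{v}(p_2)\,\not\epsilon_2\,\epsilon_{ijk}(\tau_k/2)\,u(p_1)\,a_{1i}a_{2j} \tag{13.132}$$

using the SU(2) algebra of the $\tau$'s. The third term in (13.131) gives

$$-\mathrm{i}g^2\epsilon_{ijk}\,\bar{v}(p_2)\,\not\epsilon_2\,(\tau_k/2)\,u(p_1)\,a_{1i}a_{2j} \tag{13.133}$$

$$+\,\mathrm{i}g^2\,\frac{\epsilon_{ijk}}{2k_1\cdot k_2}\,\bar{v}(p_2)\,\not k_1\,(\tau_k/2)\,u(p_1)\,k_2\cdot\epsilon_2\,a_{1i}a_{2j}. \tag{13.134}$$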


We see that the first part (13.133) certainly does cancel (13.132), but there
remains the second piece (13.134), which only vanishes if $k_2\cdot\epsilon_2 = 0$. This is
not sufficient to guarantee the absence of all unphysical contributions to the
imaginary part of the 2-gluon graphs, as the preceding discussion shows. We
conclude that loop diagrams involving two (or, in fact, more) gluons, if
constructed according to the simple rules for tree diagrams, will violate unitarity.
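
The Lorentz algebra behind the split of the three-gluon term into (13.133) and (13.134) can be checked numerically. The sketch below is an illustration only: it assumes the three-gluon bracket has the form $(p_1+p_2+k_1)^{\nu}g^{\mu\rho} + (-k_2-p_1-p_2)^{\mu}g^{\rho\nu} + (-k_1+k_2)^{\rho}g^{\mu\nu}$ with $p_1+p_2 = k_1+k_2$, and the function names and sample momenta are hypothetical. Contracting with $k_{1\mu}\epsilon_{2\nu}$ and using $k_1^2 = 0$ should give $k_1\cdot\epsilon_2\,(k_1+k_2)^\rho + k_2\cdot\epsilon_2\,k_1^\rho - 2k_1\cdot k_2\,\epsilon_2^\rho$. The first piece drops out between the on-shell spinors, and the other two are the origin of (13.134) and (13.133) respectively.

```python
G = [1, -1, -1, -1]  # metric signature (+,-,-,-) is assumed


def dot(a, b):
    """Minkowski product of two contravariant 4-vectors."""
    return sum(G[m] * a[m] * b[m] for m in range(4))


def metric(m, n):
    return G[m] if m == n else 0


def contracted_vertex(k1, k2, e2):
    """Three-gluon bracket contracted with k1_mu and e2_nu; free upper index rho."""
    q = [k1[m] + k2[m] for m in range(4)]    # q = p1 + p2 = k1 + k2
    a = [q[m] + k1[m] for m in range(4)]     # (p1 + p2 + k1)
    b = [-k2[m] - q[m] for m in range(4)]    # (-k2 - p1 - p2)
    c = [k2[m] - k1[m] for m in range(4)]    # (-k1 + k2)
    k1_low = [G[m] * k1[m] for m in range(4)]
    e2_low = [G[m] * e2[m] for m in range(4)]
    return [sum((a[n] * metric(m, r) + b[m] * metric(r, n) + c[r] * metric(m, n))
                * k1_low[m] * e2_low[n]
                for m in range(4) for n in range(4))
            for r in range(4)]


def expected(k1, k2, e2):
    """Closed form obtained by hand, valid when k1 is light-like."""
    q = [k1[m] + k2[m] for m in range(4)]
    return [dot(k1, e2) * q[r] + dot(k2, e2) * k1[r] - 2 * dot(k1, k2) * e2[r]
            for r in range(4)]


k1 = [3, 0, 0, 3]     # light-like gluon momenta in the CM frame
k2 = [3, 0, 0, -3]
e2 = [1, 2, -1, 4]    # deliberately unphysical: k2.e2 != 0

print(contracted_vertex(k1, k2, e2) == expected(k1, k2, e2))
print(dot(k2, e2))    # non-zero, so the k2.e2 piece survives
```

The polarization `e2` is chosen with $k_2\cdot\epsilon_2 \neq 0$ on purpose, since that is exactly the situation in which the leftover term matters.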

The correct rule for such loops must be such as to satisfy unitarity. Since there
seems no other way in which the offending piece in (13.134) can be removed,
we must infer that the rule for loops will have to involve some extra term, or
terms, over and above the simple tree-type constructions, which will cancel
the contributions of unphysical polarization states. To get an intuitive idea
of what such extra terms might be, we return to expression (13.126) for the
sum over unphysical polarization states $U_{\mu\nu}$, and make a specific choice for
$t$. We take $t_\mu = \bar{k}_\mu$, where the 4-vector $\bar{k}$ is defined by $\bar{k} = (-|\boldsymbol{k}|, \boldsymbol{k})$, and
$\boldsymbol{k} = (0,0,|\boldsymbol{k}|)$. This choice obviously satisfies (13.124). Then

$$U_{\mu\nu}(k,\bar{k}) = (k_\mu\bar{k}_\nu + k_\nu\bar{k}_\mu)/(2|\boldsymbol{k}|^2) \tag{13.135}$$

and unitarity (cf (13.129)) requires

$$\int \mathcal{M}^{(2)}_{\mu_1\nu_1}\,\frac{(k_1^{\mu_1}\bar{k}_1^{\mu_2} + k_1^{\mu_2}\bar{k}_1^{\mu_1})}{2|\boldsymbol{k}_1|^2}\,\mathcal{M}^{(2)}_{\mu_2\nu_2}\,\frac{(k_2^{\nu_1}\bar{k}_2^{\nu_2} + k_2^{\nu_2}\bar{k}_2^{\nu_1})}{2|\boldsymbol{k}_2|^2}\,\mathrm{d}\rho_2 \tag{13.136}$$


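to vanish, but it does not. Let us work in the centre of momentum (CM) frame
of the two gluons, with $k_1 = (|\boldsymbol{k}|,0,0,|\boldsymbol{k}|)$, $k_2 = (|\boldsymbol{k}|,0,0,-|\boldsymbol{k}|)$,
$\bar{k}_1 = (-|\boldsymbol{k}|,0,0,|\boldsymbol{k}|)$, $\bar{k}_2 = (-|\boldsymbol{k}|,0,0,-|\boldsymbol{k}|)$, and consider for definiteness
the contractions with the $\mathcal{M}^{(2)}_{\mu_1\nu_1}$ term. These are $\mathcal{M}^{(2)}_{\mu_1\nu_1}k_1^{\mu_1}k_2^{\nu_1}$, $\mathcal{M}^{(2)}_{\mu_1\nu_1}k_1^{\mu_1}\bar{k}_2^{\nu_1}$,
etc. Such quantities can be calculated from expression (13.131) by setting
$\epsilon_1 = k_1$, $\epsilon_2 = k_2$ for the first, $\epsilon_1 = k_1$, $\epsilon_2 = \bar{k}_2$ for the second, and so on. We
have already obtained the result of putting $\epsilon_1 = k_1$. From (13.134) it is clear
that a term in which $\epsilon_2$ is replaced by $k_2$ as well as $\epsilon_1$ by $k_1$ will vanish, since
$k_2^2 = 0$. A typical non-vanishing term is of the form $\mathcal{M}^{(2)}_{\mu_1\nu_1}k_1^{\mu_1}\bar{k}_2^{\nu_1}/2|\boldsymbol{k}|^2$.
From (13.134) this reduces to

$$-\mathrm{i}g^2\,\frac{\epsilon_{ijk}}{2k_1\cdot k_2}\,\bar{v}(p_2)\,\not k_1\,(\tau_k/2)\,u(p_1)\,a_{1i}a_{2j} \tag{13.137}$$

using $k_2\cdot\bar{k}_2/2|\boldsymbol{k}|^2 = -1$. We may rewrite (13.137) as

$$j_{\mu k}\,\frac{-g^{\mu\nu}}{(k_1+k_2)^2}\,\mathrm{i}g\epsilon_{ijk}\,a_{1i}a_{2j}\,k_{1\nu} \tag{13.138}$$

where

$$j_{\mu k} = g\,\bar{v}(p_2)\gamma_\mu(\tau_k/2)u(p_1) \tag{13.139}$$

is the SU(2) current associated with the $q\bar{q}$ pair.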
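
The special choice $t = \bar{k}$ can be checked against the general expression (13.126). The sketch below is a minimal illustration, assuming the metric signature (+,−,−,−) and an arbitrary sample value $|\boldsymbol{k}| = 3$; the names `U_general`, `U_special`, `k1`, `kbar1`, `k2`, `kbar2` are hypothetical. It confirms that (13.126) reduces to the form (13.135), and that $k_2\cdot\bar{k}_2/2|\boldsymbol{k}|^2 = -1$, the relation used in passing from (13.134) to (13.137).

```python
from fractions import Fraction as F

G = [1, -1, -1, -1]  # metric signature (+,-,-,-) is assumed


def dot(a, b):
    return sum(G[m] * a[m] * b[m] for m in range(4))


def lower(a):
    return [G[m] * a[m] for m in range(4)]


def U_general(k, t):
    """U_{mu nu}(k, t) from the general formula (13.126)."""
    kl, tl, kt, tt = lower(k), lower(t), dot(k, t), dot(t, t)
    return [[(tt * kl[m] * kl[n] - kt * (kl[m] * tl[n] + kl[n] * tl[m])) / kt**2
             for n in range(4)] for m in range(4)]


def U_special(k, kbar, kmag):
    """U_{mu nu}(k, kbar) in the form (13.135)."""
    kl, kbl = lower(k), lower(kbar)
    return [[(kl[m] * kbl[n] + kl[n] * kbl[m]) / (2 * kmag**2)
             for n in range(4)] for m in range(4)]


kmag = F(3)                                  # sample value of |k|
k1 = [kmag, F(0), F(0), kmag]
kbar1 = [-kmag, F(0), F(0), kmag]
k2 = [kmag, F(0), F(0), -kmag]
kbar2 = [-kmag, F(0), F(0), -kmag]

print(U_general(k1, kbar1) == U_special(k1, kbar1, kmag))
print(dot(k2, kbar2) / (2 * kmag**2))
```

Since $\bar{k}^2 = 0$ and $k\cdot\bar{k} = -2|\boldsymbol{k}|^2$, the $t^2$ term of (13.126) drops out and only the mixed term survives, which is what the comparison verifies.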

The unwanted terms of the form (13.138) can be eliminated if we adopt
the following rule (on the grounds of ‘forcing the theory to make sense’).
In addition to the fourth-order diagrams of the type shown in figure 13.10,
constructed according to the simple ‘tree’ prescriptions, there must exist a
previously unknown fourth-order contribution, only present in loops, such that
it has an imaginary part which is non-zero in the same physical region as the
two-gluon intermediate state, and moreover is of just the right magnitude to
cancel all the contributions to (13.136) from terms like (13.138). Now (13.138)
has the appearance of a one-gluon intermediate state amplitude. The $q\bar{q} \to g$
vertex is represented by the current (13.139), the gluon propagator appears
in Feynman gauge $\xi = 1$, and the rest of the expression would have the
interpretation of a coupling between the intermediate gluon and two scalar
particles with SU(2) polarizations $a_{1i}$, $a_{2j}$. Thus (13.138) can be interpreted
as the amplitude for the tree graph shown in figure 13.12, where the dotted
lines represent the scalar particles. It seems plausible, therefore, that the
fourth-order graph we are looking for has the form shown in figure 13.13.
The new scalar particles must be massless, so that this new amplitude has
an imaginary part in the same physical region as the $gg$ state. When the
imaginary part of figure 13.13 is calculated in the usual way, it will involve

FIGURE 13.12
Tree graph interpretation of the expression (13.138).

FIGURE 13.13
Ghost loop diagram contributing in fourth order to $q\bar{q} \to q\bar{q}$.


contributions from the tree graph of figure 13.12, and these can be arranged 
to cancel the unphysical polarization pieces like (13.138). 

For this cancellation to work, the scalar particle loop graph of figure 13.13 
must enter with the opposite sign from the three-gluon loop graph of figure 
13.10, which in retrospect was the cause of all the trouble. Such a relative 
minus sign between single closed loop graphs would be expected if the scalar 
particles in figure 13.13 were in fact fermions! (Recall the rule given in section 
11.3 and problem 11.2). Thus we appear to need scalar particles obeying Fermi 
statistics. Such particles are called ‘ghosts’. We must emphasize that although 
we have introduced the tree graph of figure 13.12, which apparently involves 
ghosts as external lines, in reality the ghosts are always confined to loops, their 
function being to cancel unphysical contributions from intermediate gluons. 

The preceding discussion has, of course, been entirely heuristic. It can 
be followed through so as to yield the correct prescription for eliminating 
unphysical contributions from a single closed gluon loop. But, as Feynman 
recognized (1963, 1977), unitarity alone is not a sufficient constraint to provide 
the prescription for more than one closed gluon loop. Clearly what is required 
is some additional term in the Lagrangian, which will do the job in general. 
Such a term indeed exists, and was first derived using the path integral form 
of quantum field theory (see chapter 16) by Faddeev and Popov (1967). The 
result is that the covariant gauge-fixing term (13.68) must be supplemented 
by the ‘ghost Lagrangian’ 


$$\hat{\mathcal{L}}_{\mathrm{g}} = \partial_\mu\hat{\eta}_i^\dagger\,\hat{D}^\mu_{ij}\,\hat{\eta}_j \tag{13.140}$$

where the $\eta$ field is an SU(2) triplet, and spinless, but obeying anticommutation
relations; the covariant derivative is the one appropriate for an SU(2) triplet,
namely (from (13.44) and (12.48))

$$\hat{D}^\mu_{ij} = \partial^\mu\delta_{ij} + g\epsilon_{kij}\hat{W}^\mu_k, \tag{13.141}$$


in this case. The result (13.140) is derived in standard books of quantum 
field theory, for example Cheng and Li (1984), Peskin and Schroeder (1995) 
or Ryder (1996). We should add the caution that the form of the ghost 
Lagrangian depends on the choice of the gauge-fixing term; there are gauges 
in which the ghosts are absent. Feynman rules for non-Abelian gauge field 
theories are given in Cheng and Li (1984), for example. We give the rules for 
tree diagrams, for which there are no problems with ghosts, in appendix Q.

Problems

13.1 Verify that (13.34) reduces to (13.26) in the infinitesimal case. 
13.2 Verify equation (13.45). 

13.3 Using the expression for D in (13.47), verify (13.48). 


13.4 Verify the transformation law (13.51) of $\boldsymbol{F}^{\mu\nu}$ under local SU(2)
transformations.

13.5 Verify that $\boldsymbol{F}_{\mu\nu}\cdot\boldsymbol{F}^{\mu\nu}$ is invariant under local SU(2) transformations.

13.6 Verify that the (infinitesimal) transformation law (13.56) for the SU(3)
gauge field $A^\mu_a$ is consistent with (13.55).

13.7 By considering the commutator of two $D^\mu$'s of the form (13.53), verify
(13.61).

13.8 Verify that (13.84) reduces to (13.86) (omitting the $(2\pi)^4\delta^4$ factors).

13.9 Verify that the replacement of $\epsilon_1$ by $k_1$ in (13.93) leads to (13.94).

13.10 Verify that when $\epsilon_1$ is replaced by $k_1$ in (13.95), the resulting amplitude
cancels the contribution (13.94), provided that $\delta = 1$.

13.11 Show that $P_{\mu\nu}$ of (13.122), with the $\epsilon$'s specified by the conditions
(13.123) and (13.124), is given by (13.125).

QCD I: Introduction, Tree Graph
Predictions, and Jets


In the previous chapter we have introduced the elementary concepts and for- 
malism associated with non-Abelian quantum gauge field theories. It is now 
well established that the strong interactions between quarks are described by 
a theory of this type, in which the gauge group is an SU(3)$_c$, acting on a
degree of freedom called ‘colour’ (indicated by the subscript c). This theory 
is called Quantum Chromodynamics, or QCD for short. QCD will be our first 
application of the theory developed in chapter 13, and we shall devote the 
next two chapters, and much of chapter 16, to it. 


In the present chapter we introduce QCD and discuss some of its simpler 
experimental consequences. We briefly recall the evidence for the ‘colour’ de- 
gree of freedom in section 14.1, and then proceed to the dynamics of colour, 
and the QCD Lagrangian, in section 14.2. Perhaps the most remarkable thing 
about the dynamics of QCD is that, despite its being a theory of the strong 
interactions, there are certain kinematic regimes — roughly speaking, short dis- 
tances or high energies — in which it is effectively a quite weakly interacting the- 
ory. This is a consequence of a fundamental property, possessed only by non- 
Abelian gauge theories, whereby the effective interaction strength becomes 
progressively smaller in such regimes. This property is called ‘asymptotic 
freedom’, and was already mentioned in section 11.5.3 of volume 1. In appro- 
priate cases, therefore, the lowest-order perturbation theory amplitudes (tree 
graphs) provide a very convincing qualitative, or even ‘semi-quantitative’, ori- 
entation to the data. In sections 14.3 and 14.4 we shall see how the tree graph 
techniques acquired for QED in volume 1 produce more useful physics when 
applied to QCD. 


However, most of the quantitative experimental support for QCD has come 
from comparison with predictions which include higher-order QCD correc- 
tions; indeed, the asymptotic freedom property itself emerges from summing a 
whole class of higher-order contributions, as we shall indicate at the beginning 
of chapter 15. This immediately involves all the apparatus of renormalization. 
The necessary calculations quite rapidly become too technical for the intended 
scope of this book, but in chapter 15 we shall try to provide an elementary in- 
troduction to the issues involved, and to the necessary techniques, by building 
on the discussion of renormalization given in chapters 10 and 11 of volume 1. 
The main new concept will be the renormalization group (and related ideas),
which is an essential tool in the modern confrontation of perturbative QCD
with data. Some of the simpler predictions of the renormalization group tech- 
nique will be compared with experimental data in the last part of chapter 
15. 

In chapter 16 we work towards understanding some non-perturbative as- 
pects of QCD. As a natural concomitant of asymptotic freedom, it is to be 
expected that the effective coupling strength becomes progressively larger at 
longer distances or lower energies, ultimately being strong enough to lead 
(presumably) to the confinement of quarks and gluons; this is sometimes re- 
ferred to as ‘infrared slavery’. In this regime perturbation theory clearly fails. 
An alternative, purely numerical, approach is available however, namely the 
method of ‘lattice’ QCD, which involves replacing the space-time continuum 
by a discrete lattice of points. At first sight, this may seem a topic rather 
disconnected from everything that has preceded it. But we shall see that in 
fact it provides some powerful new insights into several aspects of quantum 
field theory in general, and in particular of renormalization, by revisiting it in 
coordinate (rather than momentum) space. Quite apart from this, however, 
results from lattice QCD now provide independent confirmation of the theory,
in the non-perturbative regime.

14.1 The colour degree of freedom


The first intimation of a new, unrevealed degree of freedom of matter came
from baryon spectroscopy (Greenberg 1964; see also Han and Nambu 1965,
and Tavkhelidze 1965). For a baryon made of three spin-$\frac{1}{2}$ quarks, the original
non-relativistic quark model wavefunction took the form

$$\psi_{3q} = \psi_{3q,\mathrm{space}}\,\psi_{3q,\mathrm{spin}}\,\psi_{3q,\mathrm{flavour}}. \tag{14.1}$$

It was soon realized (e.g. Dalitz 1965) that the product of these space, spin
and flavour wavefunctions for the ground state baryons was symmetric under
interchange of any two quarks. For example, the $\Delta^{++}$ state mentioned in
section 12.2.3 is made of three u quarks (flavour symmetric) in the $J^P = \frac{3}{2}^+$
state, which has zero orbital angular momentum and is hence spatially
symmetric, and a symmetric $S = \frac{3}{2}$ spin wavefunction. But we saw in section
7.2 that quantum field theory requires fermions to obey the exclusion principle,
i.e. the wavefunction $\psi_{3q}$ should be antisymmetric with respect to quark
interchange. A simple way of implementing this requirement is to suppose
that the quarks carry a further degree of freedom, called colour, with respect
to which the 3q wavefunction can be antisymmetrized, as follows (Fritzsch
and Gell-Mann 1972, Bardeen, Fritzsch and Gell-Mann 1973). We introduce
a colour wavefunction with colour index $\alpha$:

$$\psi_\alpha \qquad (\alpha = 1,2,3).$$

We are here writing the three labels as ‘1, 2, 3’, but they are often referred to
by colour names such as ‘red, blue, green’; it should be understood that this
is merely a picturesque way of referring to the three basic states of this degree
of freedom, and has nothing to do with real colour! With the addition of this
degree of freedom we can certainly form a three-quark wavefunction which is
antisymmetric in colour by using the antisymmetric symbol $\epsilon_{\alpha\beta\gamma}$, namely¹

$$\psi_{3q,\mathrm{colour}} = \epsilon_{\alpha\beta\gamma}\psi_\alpha\psi_\beta\psi_\gamma \tag{14.2}$$


and this must then be multiplied into (14.1) to give the full 3q wavefunction.
To date, all known baryon states can be described this way, i.e. the
‘traditional’ space-spin-flavour wavefunction (14.1) is symmetric overall,
while the required antisymmetry is restored by the additional factor (14.2). As
far as meson ($q\bar{q}$) states are concerned, what was previously a $\pi^+$ wavefunction
$d^*u$ is now

$$\frac{1}{\sqrt{3}}(d_1^*u_1 + d_2^*u_2 + d_3^*u_3) \tag{14.3}$$

which we write in general as $(1/\sqrt{3})\,d_\alpha^\dagger u_\alpha$. We shall shortly see the group
theoretical significance of this ‘neutral superposition’, and of (14.2). Meanwhile,
we note that (14.2) is actually the only way of making an antisymmetric
combination of the three $\psi$'s; it is therefore called a (colour) singlet. It is
reassuring that there is only one way of doing this; otherwise, we would have
obtained more baryon states than are physically observed. As we shall see in
section 14.2.1, (14.3) is also a colour singlet combination.

The above would seem a somewhat artificial device unless there were some 
physical consequences of this increase in the number of quark types — and there 
are. In any process which we can describe in terms of creation or annihilation 
of quarks, the multiplicity of quark types will enter into the relevant observable 
cross section or decay rate. For example, at high energies the ratio 


$$R = \frac{\sigma(e^+e^- \to \mathrm{hadrons})}{\sigma(e^+e^- \to \mu^+\mu^-)} \tag{14.4}$$

will, in the quark parton model (see section 9.5), reflect the magnitudes of the
individual quark couplings to the photon:

$$R = \sum_a e_a^2 \tag{14.5}$$

where $a$ runs over all quark types. For five quarks u, d, s, c, b with respective
charges $\frac{2}{3}, -\frac{1}{3}, -\frac{1}{3}, \frac{2}{3}, -\frac{1}{3}$, this yields


$$R_{\mathrm{no\ colour}} = \frac{11}{9} \tag{14.6}$$

¹In (14.2) each $\psi$ refers to a different quark, but we have not indicated the quark labels
explicitly.

FIGURE 14.1
The ratio $R$ (see (14.4)). Figure reprinted with permission from L. Montanet
et al. Physical Review D 50 1173 (1994). Copyright 1994 by the American
Physical Society.

and

$$R_{\mathrm{colour}} = \frac{11}{3} \tag{14.7}$$


for the two cases, as we saw in section 9.5. (The values $R = 2$ below the charm
threshold, and $R = 10/3$ below the b threshold, were predicted by Bardeen et
al. 1973.) The data (figure 14.1) rule out (14.6), and are in good agreement
with (14.7) at energies well above the b threshold, and well below the $Z^0$
resonance peak. There is an indication that the data tend to lie above the
parton model prediction; this is actually predicted by QCD via higher-order
corrections, as will be discussed in section 15.1.
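
The parton-model values quoted above follow from simple charge counting, which can be reproduced in a few lines. This is a minimal sketch using exact fractions; the function name `r_ratio` and the dictionary `CHARGES` are illustrative, and the threshold sets simply list which flavours are taken to be active.

```python
from fractions import Fraction as F

# Quark electric charges in units of the proton charge.
CHARGES = {"u": F(2, 3), "d": F(-1, 3), "s": F(-1, 3), "c": F(2, 3), "b": F(-1, 3)}


def r_ratio(flavours, n_colours=1):
    """Parton-model R: sum of squared charges over active flavours, times N_c."""
    return n_colours * sum(CHARGES[f] ** 2 for f in flavours)


print(r_ratio("udscb"))               # 11/9, the no-colour value (14.6)
print(r_ratio("udscb", n_colours=3))  # 11/3, the three-colour value (14.7)
print(r_ratio("uds", n_colours=3))    # 2, below the charm threshold
print(r_ratio("udsc", n_colours=3))   # 10/3, below the b threshold
```

The factor `n_colours` multiplies the whole sum because each flavour contributes once per colour state.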

A number of branching fractions also provide simple ways of measuring
the number of colours $N_c$. For example, consider the branching fraction for
$\tau^- \to e^-\bar{\nu}_e\nu_\tau$ (i.e. the ratio of the rate for $\tau^- \to e^-\bar{\nu}_e\nu_\tau$ to that for all
decays). $\tau^-$ decays proceed via the weak process shown in figure 14.2, where
the final fermions can be $e^-\bar{\nu}_e$, $\mu^-\bar{\nu}_\mu$, or $\bar{u}d$, the last with multiplicity $N_c$.
Thus

$$B(\tau^- \to e^-\bar{\nu}_e\nu_\tau) \approx \frac{1}{2+N_c}. \tag{14.8}$$

Experiments give $B \approx 18\%$ and hence $N_c \approx 3$.

Similarly, the branching fraction $B(W^- \to e^-\bar{\nu}_e)$ is $\sim 1/(3+2N_c)$ (from $f = e, \mu, \tau, u$ and $c$).
Experiment gives a value of 10.7%, so again $N_c \approx 3$.
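
The channel counting behind these two estimates can be tabulated for different values of $N_c$. The sketch below assumes only the naive counting described in the text (equal weight per open channel, quark channels weighted by $N_c$, no mass or QCD corrections); the function names are illustrative.

```python
from fractions import Fraction as F


def b_tau_to_e(n_c):
    """Naive B(tau -> e nu nu): channels e, mu, and (ubar d) x N_c."""
    return F(1, 2 + n_c)


def b_w_to_e(n_c):
    """Naive B(W -> e nu): three lepton channels plus two quark doublets x N_c."""
    return F(1, 3 + 2 * n_c)


for n_c in (1, 2, 3, 4):
    tau, w = b_tau_to_e(n_c), b_w_to_e(n_c)
    print(f"N_c = {n_c}:  B(tau) = {float(tau):.1%}   B(W) = {float(w):.1%}")
```

For $N_c = 3$ the counting gives 20% and about 11.1%, to be compared with the measured values of roughly 18% and 10.7% quoted above; neighbouring integer values of $N_c$ are clearly further from the data.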

In chapter 9 we also discussed the Drell-Yan process in the quark parton
model; it involves the subprocess $q\bar{q} \to l\bar{l}$ which is the inverse of the one in
(14.4). We mentioned that a factor of $\frac{1}{3}$ appears in this case: it arises because
we must average over the nine possible initial $q\bar{q}$ combinations (factor $\frac{1}{9}$)
and then sum over the number of such states that lead to the colour neutral
photon, which is 3 ($\bar{q}_1q_1$, $\bar{q}_2q_2$ and $\bar{q}_3q_3$). With this factor, and using quark
distribution functions consistent with deep inelastic scattering, the parton
model gives a good first approximation to the data.

FIGURE 14.2
$\tau$ decay.

FIGURE 14.3
Triangle graph for $\pi^0$ decay.

Finally, we mention the rate for $\pi^0 \to \gamma\gamma$. As will be discussed in section
18.4, this process is entirely calculable from the graph shown in figure 14.3
(and the one with the $\gamma$'s ‘crossed’), where ‘q’ is u or d. The amplitude is
proportional to the square of the quark charges, but because the $\pi^0$ is an
isovector, the contributions from the $u\bar{u}$ and $d\bar{d}$ states have opposite signs
(see section 12.1.3). Thus the rate contains a factor

$$((2/3)^2 - (1/3)^2)^2 = \frac{1}{9}. \tag{14.9}$$
However, the original calculation of this rate by Steinberger (1949) used a
model in which the proton and neutron replaced the u and d in the loop, in
which case the factor corresponding to (14.9) is just 1 (since the n has zero
charge). Experimentally the rate agrees well with Steinberger’s calculation,
indicating that (14.9) needs to be multiplied by 9, which corresponds to $N_c = 3$
identical amplitudes of the form shown in figure 14.3, as was noted by Bardeen,
Fritzsch and Gell-Mann (1973).
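
The arithmetic of this argument is short enough to check directly. In the sketch below the amplitude is taken to be proportional to $N_c(e_u^2 - e_d^2)$, which is the assumption stated in the text (identical amplitudes for each colour), and the rate factor is its square; the function name and default arguments are illustrative.

```python
from fractions import Fraction as F


def pi0_rate_factor(n_c, e_up=F(2, 3), e_down=F(-1, 3)):
    """Charge factor in the pi0 -> gamma gamma rate for n_c identical colour amplitudes."""
    amplitude = n_c * (e_up**2 - e_down**2)
    return amplitude**2


print(pi0_rate_factor(1))                            # 1/9, the colourless quark value (14.9)
print(pi0_rate_factor(3))                            # 1, three colours
print(pi0_rate_factor(1, e_up=F(1), e_down=F(0)))    # 1, the proton-neutron loop
```

Three colours therefore reproduce exactly the factor obtained with the proton and neutron in the loop.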
14.2 The dynamics of colour 
14.2.1 Colour as an SU(3) group 


We now want to consider the possible dynamical role of colour — in other 
words, the way in which the forces between quarks depend on their colours. 
We have seen that we seem to need three different quark types for each given 
flavour. They must all have the same mass, or else we would observe some 
‘fine structure’ in the hadronic levels. Furthermore, and for the same reason, 
‘colour’ must be an exact symmetry of the Hamiltonian governing the quark 
dynamics. What symmetry group is involved? We shall consider how some
empirical facts suggest that the answer is SU(3)$_c$.

To begin with, it is certainly clear that the interquark force must depend
on colour, since we do not observe ‘colour multiplicity’ of hadronic states: for
example we do not see eight other coloured $\pi^+$'s ($d_1^*u_2$, $d_3^*u_1$, ...) degenerate
with the one ‘colourless’ physical $\pi^+$ whose wavefunction was given previously.
The observed hadronic states are all colour singlets, and the force must
somehow be responsible for this. More particularly, the force has to produce
only those very restricted types of quark configuration which are observed in
the hadron spectrum. Consider again the isospin multiplets in nuclear physics
discussed in section 12.1.2. There is one very striking difference in the particle
physics case: for mesons only $T = 0, \frac{1}{2}$ and 1 occur, and for baryons
only $T = 0, \frac{1}{2}, 1$ and $\frac{3}{2}$, while in nuclei there is nothing in principle to stop
us finding $T = \frac{5}{2}, 3, \ldots$ states. (In fact such nuclear states are hard to identify
experimentally, because they occur at high excitation energy for some of
the isobars (cf figure 1.8(c)) where the levels are very dense.) The same
restriction holds for SU(3)$_f$ also: only 1's and 8's occur for mesons, and only
1's, 8's and 10's for baryons. In quark terms, this of course is what is translated
into the recipe: ‘mesons are $\bar{q}q$, baryons are $qqq$’. It is as if we said,
in nuclear physics, that only $A = 2$ and $A = 3$ nuclei exist! Thus the quark
forces must have a dramatic saturation property: apparently no $\bar{q}qq$, no $qqqq$,
$qqqqq$, ... states exist. Furthermore, no $qq$ or $\bar{q}\bar{q}$ states exist either; nor, for
that matter, do single q's or $\bar{q}$'s. All this can be summarized by saying that
the quark colour degree of freedom must be confined, a property we shall now
assume and return to in chapter 16.

If we assume that only colour singlet states exist (Fritzsch and Gell-Mann
1972, Bardeen, Fritzsch and Gell-Mann 1973), and that the strong interquark
force depends only on colour, the fact that $\bar{q}q$ states are seen but $qq$ and $\bar{q}\bar{q}$ are
not gives us an important clue as to what group to associate with colour. One
simple possibility might be that the three colours correspond to the components
of an SU(2)$_c$ triplet ‘$\boldsymbol{\psi}$’. The antisymmetric, colour singlet, three-quark
baryon wavefunction of (14.2) is then just the triple scalar product $\boldsymbol{\psi}_1\cdot\boldsymbol{\psi}_2\times\boldsymbol{\psi}_3$,
which seems satisfactory. But what about the meson wavefunction? Mesons
are formed of quarks and antiquarks, and we recall from sections 12.1.3 and
12.2 that antiquarks belong to the complex conjugate of the representation (or
multiplet) to which quarks belong. Thus if a quark colour triplet wavefunction
$\psi_\alpha$ transforms under a colour transformation as

$$\psi_\alpha \to \psi'_\alpha = V^{(1)}_{\alpha\beta}\psi_\beta \tag{14.10}$$

where $V^{(1)}$ is a 3 × 3 unitary matrix appropriate to the $T = 1$ representation
of SU(2) (cf (12.48) and (12.49)), then the wavefunction for the ‘anti’-triplet
is $\psi^*_\alpha$, which transforms as

$$\psi^*_\alpha \to \psi^{*\prime}_\alpha = V^{(1)*}_{\alpha\beta}\psi^*_\beta. \tag{14.11}$$

Given this information, we can now construct colour singlet wavefunctions for
mesons, built from $\bar{q}q$. Consider the quantity (cf (14.3)) $\sum_\alpha\psi^*_\alpha\psi_\alpha$ where $\psi^*$
represents the antiquark and $\psi$ the quark. This may be written in matrix
notation as $\psi^\dagger\psi$ where the $\psi^\dagger$ as usual denotes the transpose of the complex
conjugate of the column vector $\psi$. Then, taking the transpose of (14.11), we
find that $\psi^\dagger$ transforms by

$$\psi^\dagger \to \psi^{\dagger\prime} = \psi^\dagger V^{(1)\dagger} \tag{14.12}$$

so that the combination $\psi^\dagger\psi$ transforms as

$$\psi^\dagger\psi \to \psi^{\dagger\prime}\psi' = \psi^\dagger V^{(1)\dagger}V^{(1)}\psi = \psi^\dagger\psi \tag{14.13}$$

where the last step follows since $V^{(1)}$ is unitary (compare (12.58)). Thus the
product is invariant under (14.10) and (14.11), that is, it is a colour singlet,
as required. This is the meaning of the superposition (14.3).

All this may seem fine, but there is a problem. The three-dimensional
representation of SU(2)$_c$ which we are using here has a very special nature:
the matrix $V^{(1)}$ can be chosen to be real. This can be understood ‘physically’
if we make use of the great similarity between SU(2) and the group of rotations
in three dimensions (which is the reason for the geometrical language of
isospin ‘rotations’, and so on). We know very well how real three-dimensional
vectors transform, namely by an orthogonal 3 × 3 matrix. It is the same in
SU(2). It is always possible to choose the wavefunctions $\psi$ to be real, and the
transformation matrix $V^{(1)}$ to be real also. Since $V^{(1)}$ is, in general, unitary,
this means that it must be orthogonal. But now the basic difficulty appears:
there is no distinction between $\psi$ and $\psi^*$! They both transform by the real
matrix $V^{(1)}$. This means that we can make SU(2) invariant (colour singlet)
combinations for $qq$ states, and for $\bar{q}\bar{q}$ states, just as well as for $\bar{q}q$ states;
indeed they are formally identical. But such ‘diquark’ (or ‘antidiquark’) states
are not found, and hence (by assumption) should not be colour singlets.

The next simplest possibility seems to be that the three colours correspond
to the components of an SU(3)$_c$ triplet. In this case the quark colour
wavefunction $\psi_\alpha$ transforms as (cf (12.74))

$$\psi \to \psi' = W\psi \tag{14.14}$$

where $W$ is a special unitary 3 × 3 matrix parametrized as

$$W = \exp(\mathrm{i}\boldsymbol{\alpha}\cdot\boldsymbol{\lambda}/2), \tag{14.15}$$

and $\psi^\dagger$ transforms as

$$\psi^\dagger \to \psi^{\dagger\prime} = \psi^\dagger W^\dagger. \tag{14.16}$$


The proof of the invariance of $\psi^\dagger\psi$ goes through as in (14.13), and it can be
shown (problem 14.1(a)) that the antisymmetric 3q combination (14.2) is also
an SU(3)$_c$ invariant. Thus both the proposed meson and baryon states are
colour singlets. It is not possible to choose the $\lambda$'s to be pure imaginary in
(14.15), and thus the 3 × 3 $W$ matrices of SU(3)$_c$ cannot be real, so that there
is a distinction between $\psi$ and $\psi^*$, as we learned in section 12.2. Indeed, it
can be shown (see Carruthers 1966, chapter 3, Jones 1990, chapter 8, and also
problem 14.1(b)) that, unlike the case of SU(2)$_c$ triplets, it is not possible to
form an SU(3)$_c$ colour singlet combination out of two colour triplets $qq$ or
anti-triplets $\bar{q}\bar{q}$. Thus SU(3)$_c$ seems to be a possible and economical choice
for the colour group.
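
The contrast between the two candidate groups can be illustrated numerically. The sketch below is an illustration under simplifying assumptions, not a proof: `V` is one particular real rotation matrix (a $T = 1$ matrix in the real basis), `W` is one particular diagonal special unitary matrix (of the form (14.15) with only the diagonal $\lambda$'s switched on), and the vectors `psi`, `phi` and all function names are arbitrary illustrative choices.

```python
import cmath
import math


def apply(M, v):
    """Matrix times column vector for 3 x 3 matrices."""
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]


def qq_product(a, b):
    """Diquark-type combination: sum over alpha of a_alpha b_alpha."""
    return sum(x * y for x, y in zip(a, b))


def qbar_q_product(a, b):
    """Meson-type combination: sum over alpha of a*_alpha b_alpha."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def close(x, y, tol=1e-12):
    return abs(x - y) < tol


theta = 0.7
c, s = math.cos(theta), math.sin(theta)
V = [[c, -s, 0], [s, c, 0], [0, 0, 1]]            # real orthogonal (rotation about z)

a, b = 0.4, 1.1
W = [[cmath.exp(1j * a), 0, 0],
     [0, cmath.exp(1j * b), 0],
     [0, 0, cmath.exp(-1j * (a + b))]]            # unitary with unit determinant

psi = [1 + 0j, 1j, 2 + 0j]
phi = [1 + 0j, 2 + 0j, 1j]

# Real triplet transformation: both combinations are invariant.
print(close(qbar_q_product(apply(V, psi), apply(V, phi)), qbar_q_product(psi, phi)))
print(close(qq_product(apply(V, psi), apply(V, phi)), qq_product(psi, phi)))

# Complex SU(3) transformation: only the quark-antiquark combination is invariant.
print(close(qbar_q_product(apply(W, psi), apply(W, phi)), qbar_q_product(psi, phi)))
print(close(qq_product(apply(W, psi), apply(W, phi)), qq_product(psi, phi)))
```

The first three comparisons succeed and the last one fails, which mirrors the statement that the diquark combination is a singlet for the real SU(2) triplet but not for an SU(3) triplet.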


14.2.2 Global SU(3)$_c$ invariance, and ‘scalar gluons’


As stated above, we are assuming, on empirical grounds, that the only physically
observed hadronic states are colour singlets, and this now means singlets
under SU(3)$_c$. What sort of interquark force could produce this dramatic
result? Consider an SU(2) analogy again, the interaction of two nucleons belonging
to the lowest (doublet) representation of SU(2). Labelling the states
by an isospin $T$, the possible $T$ values for two nucleons are $T = 1$ (triplet) and
$T = 0$ (singlet). We know of an isospin-dependent force which can produce a
splitting between these states, namely $V\boldsymbol{\tau}_1\cdot\boldsymbol{\tau}_2$, where the ‘1’ and ‘2’ refer to
the two nucleons. The total isospin is $\boldsymbol{T} = \frac{1}{2}(\boldsymbol{\tau}_1 + \boldsymbol{\tau}_2)$, and we have

$$\boldsymbol{T}^2 = \frac{1}{4}(\boldsymbol{\tau}_1^2 + 2\boldsymbol{\tau}_1\cdot\boldsymbol{\tau}_2 + \boldsymbol{\tau}_2^2) = \frac{1}{4}(3 + 2\boldsymbol{\tau}_1\cdot\boldsymbol{\tau}_2 + 3) \tag{14.17}$$

whence

$$\boldsymbol{\tau}_1\cdot\boldsymbol{\tau}_2 = 2\boldsymbol{T}^2 - 3. \tag{14.18}$$

In the triplet state $\boldsymbol{T}^2 = 2$, and in the singlet state $\boldsymbol{T}^2 = 0$. Thus

$$(\boldsymbol{\tau}_1\cdot\boldsymbol{\tau}_2)_{T=1} = 1 \tag{14.19}$$

$$(\boldsymbol{\tau}_1\cdot\boldsymbol{\tau}_2)_{T=0} = -3 \tag{14.20}$$

and if $V$ is positive the $T = 0$ state is pulled down. A similar thing happens
in SU(3)$_c$. Suppose this interquark force depended on the quark colours via
a term proportional to

$$\boldsymbol{\lambda}_1\cdot\boldsymbol{\lambda}_2. \tag{14.21}$$
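
The SU(2) eigenvalues (14.19) and (14.20) can be verified by building $\boldsymbol{\tau}_1\cdot\boldsymbol{\tau}_2$ explicitly on the four-dimensional two-nucleon space. The following is a minimal sketch in plain Python; the basis ordering (uu, ud, du, dd), the helper names and the unnormalised state vectors are illustrative choices.

```python
# Pauli matrices tau_x, tau_y, tau_z.
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def kron(A, B):
    """Kronecker product of two 2 x 2 matrices (basis order uu, ud, du, dd)."""
    return [[A[i][j] * B[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]


def add(A, B):
    return [[A[i][j] + B[i][j] for j in range(4)] for i in range(4)]


def apply(M, v):
    return [sum(M[i][j] * v[j] for j in range(4)) for i in range(4)]


# tau_1 . tau_2 = sum over a of (tau_a acting on nucleon 1) x (tau_a acting on nucleon 2)
tau1_dot_tau2 = [[0] * 4 for _ in range(4)]
for tau in PAULI:
    tau1_dot_tau2 = add(tau1_dot_tau2, kron(tau, tau))

triplet = [1, 0, 0, 0]     # |uu>, a T = 1 state
singlet = [0, 1, -1, 0]    # |ud> - |du>, the T = 0 state (unnormalised)

print(apply(tau1_dot_tau2, triplet))   # equals +1 times the triplet vector
print(apply(tau1_dot_tau2, singlet))   # equals -3 times the singlet vector
```

Normalisation of the states is irrelevant for reading off eigenvalues, so it is omitted to keep the arithmetic exact.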
Then, in just the same way, we can introduce the total colour operator

$$\boldsymbol{F} = \frac{1}{2}(\boldsymbol{\lambda}_1 + \boldsymbol{\lambda}_2), \tag{14.22}$$

so that

$$\boldsymbol{F}^2 = \frac{1}{4}(\boldsymbol{\lambda}_1^2 + 2\boldsymbol{\lambda}_1\cdot\boldsymbol{\lambda}_2 + \boldsymbol{\lambda}_2^2) \tag{14.23}$$

and

$$\boldsymbol{\lambda}_1\cdot\boldsymbol{\lambda}_2 = 2\boldsymbol{F}^2 - \boldsymbol{\lambda}^2, \tag{14.24}$$

where $\boldsymbol{\lambda}_1^2 = \boldsymbol{\lambda}_2^2 = \boldsymbol{\lambda}^2$, say. Here $\boldsymbol{\lambda}^2 \equiv \sum_{a=1}^{8}(\lambda_a)^2$ is found (see (12.75)) to
have the value 16/3 (the unit matrix being understood). The operator $\boldsymbol{F}^2$
commutes with all components of $\boldsymbol{\lambda}_1$ and $\boldsymbol{\lambda}_2$ (as $\boldsymbol{T}^2$ does with $\boldsymbol{\tau}_1$ and $\boldsymbol{\tau}_2$)
and represents the quadratic Casimir operator $C_2$ of SU(3)$_c$ (see section M.5
of appendix M), in the colour space of the two quarks considered here. The
eigenvalues of $C_2$ play a very important role in SU(3)$_c$, analogous to that of the
total spin/angular momentum in SU(2). They depend on the SU(3)$_c$ representation:
indeed, they are one of the defining labels of SU(3) representations
in general (see section M.5). Two quarks, each in the representation $3_c$, combine
to give a $6_c$-dimensional representation and a $3_c^*$ (see problem 14.1(b),
and Jones (1990) chapter 8). The value of $C_2$ for the sextet $6_c$ representation
is 10/3, and for the $3_c^*$ representation is 4/3. Thus the ‘$\boldsymbol{\lambda}_1\cdot\boldsymbol{\lambda}_2$’ interaction
will produce a negative (attractive) eigenvalue $-8/3$ in the $3_c^*$ states, but a
repulsive eigenvalue $+4/3$ in the $6_c$ states, for two quarks.
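
The value $\boldsymbol{\lambda}^2 = 16/3$ can be confirmed by summing the squares of the eight Gell-Mann matrices. The sketch below assumes the standard form of these matrices (the text refers to (12.75) for them); the names `GELL_MANN`, `matmul` and `lambda_squared` are illustrative, and floating-point arithmetic is used because $\lambda_8$ contains $1/\sqrt{3}$.

```python
import math

s3 = 1 / math.sqrt(3)

# Standard Gell-Mann matrices lambda_1 ... lambda_8 (assumed convention).
GELL_MANN = [
    [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
    [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]],
    [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
    [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
    [[0, 0, -1j], [0, 0, 0], [1j, 0, 0]],
    [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
    [[0, 0, 0], [0, 0, -1j], [0, 1j, 0]],
    [[s3, 0, 0], [0, s3, 0], [0, 0, -2 * s3]],
]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


# Sum over a of (lambda_a)^2, expected to be (16/3) times the unit matrix.
lambda_squared = [[0] * 3 for _ in range(3)]
for lam in GELL_MANN:
    sq = matmul(lam, lam)
    lambda_squared = [[lambda_squared[i][j] + sq[i][j] for j in range(3)]
                      for i in range(3)]

for row in lambda_squared:
    print([round(abs(x), 6) for x in row])

# Casimir of the triplet: C_2 = lambda^2 / 4, expected to be 4/3.
print(abs(lambda_squared[0][0]) / 4)
```

Dividing by 4 gives the triplet Casimir eigenvalue 4/3, since the colour generators are $\lambda_a/2$.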

The maximum attraction will clearly be for states in which $\boldsymbol{F}^2$ is zero.
This is the singlet representation $1_c$. Two quarks cannot combine to give
a colour singlet state, but we have seen in section 12.2 that a quark and an
antiquark can: they combine to give $1_c$ and $8_c$. In this case (14.24) is replaced
by

$$\boldsymbol{\lambda}_1\cdot\boldsymbol{\lambda}_2 = 2\boldsymbol{F}^2 - \frac{1}{2}(\boldsymbol{\lambda}_1^2 + \boldsymbol{\lambda}_2^2), \tag{14.25}$$

where ‘1’ refers to the quark and ‘2’ to the antiquark. Thus the ‘$\boldsymbol{\lambda}_1\cdot\boldsymbol{\lambda}_2$’
interaction will give a repulsive eigenvalue $+2/3$ in the $8_c$ channel, for which
$C_2 = 3$, and a ‘maximally attractive’ eigenvalue $-16/3$ in the $1_c$ channel, for
a quark and an antiquark.

In the case of baryons, built from three quarks, we have seen that when
two of them are coupled to the $3_c^*$ state, the eigenvalue of $\boldsymbol{\lambda}_1\cdot\boldsymbol{\lambda}_2$ is $-8/3$, one
half of the attraction in the $\bar{q}q$ colour singlet state, but still strongly attractive.
The ($qq$) pair in the $3_c^*$ state can then couple to the remaining third quark to
make the overall colour singlet state (14.2), with maximum binding.

FIGURE 14.4
Scalar gluon exchange between two quarks.

Of course, such a simple potential model does not imply that the energy
difference between the $\mathbf{1}_c$ states and all coloured states is infinite, as our
strict ‘colour singlets only’ hypothesis would demand, and which would be
one (rather crude) way of interpreting confinement. Nevertheless, we can ask:
what single particle exchange process between quark (or antiquark) colour
triplets produces a $\lambda_1 \cdot \lambda_2$ type of term? The answer is the exchange of
an SU(3)$_c$ octet ($\mathbf{8}_c$) of particles, which (anticipating somewhat) we shall call
gluons. Since colour is an exact symmetry, the quark wave equation describing
the colour interactions must be SU(3)$_c$ covariant. A simple such equation is


$$(i\slashed{\partial} - m)\psi = g_s \frac{\lambda_a}{2} A_a \psi \tag{14.26}$$


where $g_s$ is a ‘strong charge’ and $A_a$ ($a = 1, 2, \ldots, 8$) is an octet of scalar
‘gluon potentials’. Equation (14.26) may be compared with (13.58): in the
latter, $\slashed{A}_a$ appears on the right-hand side, because the gauge field quanta
are vectors rather than scalars. In (14.26), we are dealing at this stage only
with a global SU(3) symmetry, not a local SU(3) gauge symmetry, and so the
potentials may be taken to be scalars, for simplicity. As in (13.60), the vertex
corresponding to (14.26) is

$$-i g_s \lambda_a/2. \tag{14.27}$$

(14.27) differs from (13.60) simply in the absence of the $\gamma^\mu$ factor, due to
the assumed scalar, rather than vector, nature of the ‘gluon’ here. When we
put two such vertices together and join them with a gluon propagator (figure
14.4), the SU(3)$_c$ structure of the amplitude will be

$$\frac{\lambda_{1a}}{2}\,\delta_{ab}\,\frac{\lambda_{2b}}{2} = \frac{\lambda_1}{2} \cdot \frac{\lambda_2}{2}, \tag{14.28}$$

the $\delta_{ab}$ arising from the fact that the freely propagating gluon does not change
its colour. This interaction has exactly the required ‘$\lambda_1 \cdot \lambda_2$’ character in the
colour space.


14.2.3 Local SU(3)$_c$ invariance: the QCD Lagrangian


It is tempting to suppose (Fritzsch and Gell-Mann 1972, Fritzsch, Gell-Mann 
and Leutwyler 1973) that the ‘scalar gluons’ introduced in (14.26) are, in fact, 
vector particles, like the photons of QED. Equation (14.26) then becomes 


$$(i\slashed{\partial} - m)\psi = g_s \frac{\lambda_a}{2} \slashed{A}_a \psi \tag{14.29}$$

as in (13.58), and the vertex (14.27) becomes

$$-i g_s \frac{\lambda_a}{2} \gamma^\mu \tag{14.30}$$

as in (13.60). One motivation for this is the desire to make the colour dynamics
as much as possible like the highly successful theory of QED, and to derive
the dynamics from a gauge principle. As we have seen in the last chapter, this
involves the simple but deep step of supposing that the quark wave equation
is covariant under local SU(3)$_c$ transformations of the form

$$\psi \to \psi' = \exp\left(i g_s \boldsymbol{\alpha}(x) \cdot \boldsymbol{\lambda}/2\right)\psi. \tag{14.31}$$

This is implemented by the replacement

$$\partial_\mu \to \partial_\mu + i g_s \frac{\lambda_a}{2} A_{a\mu}(x) \tag{14.32}$$

in the Dirac equation for the quarks, which leads immediately to (14.29) and
the vertex (14.30).

Of course, the assumption of local SU(3)$_c$ covariance leads to a great deal
more: for example, it implies that the gluons are massless vector (spin 1)
particles, and that they interact with themselves via three-gluon and
four-gluon vertices, which are the SU(3)$_c$ analogues of the SU(2) vertices discussed
in section 13.3.2. The most compact way of summarizing all this structure is
via the Lagrangian, most of which we have already introduced in chapter 13.
Gathering together (13.71) and (13.140) (adapted to SU(3)$_c$), we write it out
here for convenience:


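$$\hat{\mathcal{L}}_{\rm QCD} = \sum_{\text{flavours } f} \bar{\hat{q}}_{f,\alpha}\,(i\hat{\slashed{D}} - m_f)_{\alpha\beta}\,\hat{q}_{f,\beta} - \frac{1}{4}\hat{F}_{a\mu\nu}\hat{F}_a^{\mu\nu} - \frac{1}{2\xi}(\partial_\mu \hat{A}_a^\mu)(\partial_\nu \hat{A}_a^\nu) + \partial_\mu \hat{\eta}_a^\dagger\,\hat{D}^\mu_{ab}\,\hat{\eta}_b. \tag{14.33}$$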


In (14.33), repeated indices are as usual summed over: $\alpha$ and $\beta$ are
SU(3)$_c$-triplet indices running from 1 to 3, and $a$, $b$ are SU(3)$_c$-octet indices running
from 1 to 8. The covariant derivatives are defined by

$$(\hat{D}_\mu)_{\alpha\beta} = \partial_\mu \delta_{\alpha\beta} + i g_s \tfrac{1}{2}(\lambda_a)_{\alpha\beta}\,\hat{A}_{a\mu} \tag{14.34}$$

when acting on the quark SU(3)$_c$ triplet, as in (13.53), and by

$$(\hat{D}_\mu)_{ab} = \partial_\mu \delta_{ab} + g_s f_{cab}\,\hat{A}_{c\mu} \tag{14.35}$$

when acting on the octet of ghost fields. For the second of these, note that
the matrices representing the SU(3) generators in the octet representation are
as given in (12.84), and these take the place of the ‘$\lambda/2$’ in (14.34) (compare
(13.141) in the SU(2) case). We remind the reader that the last two terms
in (14.33) are the gauge-fixing and ghost terms, respectively, appropriate to
a gauge field propagator of the form (13.69) (with $\delta_{ij}$ replaced by $\delta_{ab}$ here).
The Feynman rules following from (14.33) are given in appendix Q.
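The structure constants $f_{abc}$ entering (14.35) can be computed directly from the $\lambda$ matrices. The sketch below assumes the standard Gell-Mann matrices and the usual convention $[\lambda_a, \lambda_b] = 2i f_{abc}\lambda_c$, so that $f_{abc}$ is the trace of $[\lambda_a,\lambda_b]\lambda_c$ divided by $4i$; the helper names are illustrative only.

```python
import math

I = 1j
R3 = 1 / math.sqrt(3)

# The eight Gell-Mann matrices (assumed standard convention).
GELL_MANN = [
    [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
    [[0, -I, 0], [I, 0, 0], [0, 0, 0]],
    [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
    [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
    [[0, 0, -I], [0, 0, 0], [I, 0, 0]],
    [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
    [[0, 0, 0], [0, 0, -I], [0, I, 0]],
    [[R3, 0, 0], [0, R3, 0], [0, 0, -2 * R3]],
]


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def trace(a):
    return sum(a[i][i] for i in range(3))


def structure_constant(a, b, c):
    """f_abc = Tr([l_a, l_b] l_c) / (4i), with a, b, c running from 1 to 8."""
    la, lb, lc = GELL_MANN[a - 1], GELL_MANN[b - 1], GELL_MANN[c - 1]
    ab, ba = matmul(la, lb), matmul(lb, la)
    comm = [[ab[i][j] - ba[i][j] for j in range(3)] for i in range(3)]
    return (trace(matmul(comm, lc)) / (4 * I)).real


print(structure_constant(1, 2, 3))  # the SU(2)-like subgroup: f_123 = 1
print(structure_constant(2, 1, 3))  # antisymmetry under a <-> b
print(structure_constant(4, 5, 8))  # close to sqrt(3)/2
```

The first value reflects the SU(2) subalgebra generated by $\lambda_1, \lambda_2, \lambda_3$, which is why the octet covariant derivative (14.35) has the same form as its SU(2) counterpart.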

As remarked in section 12.3.2, the fact that the QCD interactions (14.33)
are ‘flavour-blind’ implies that the global flavour symmetries discussed in
chapter 12 are all preserved by QCD. These include the conservation of each
quark flavour (for example, the number of strange quarks minus the number
of strange antiquarks is conserved); and the symmetries SU(2)$_f$ and SU(3)$_f$,
and the chiral symmetries SU(2)$_{5f}$ and SU(3)$_{5f}$, to the extent that these latter
are good symmetries. Further, (14.33) conserves the discrete symmetries P,
C and T, in a manner quite analogous to QED, already covered in section 7.5.
In the case of P and T, the gluon fields $\hat{A}_{a\mu}$ have the same transformation
properties as the photon field $\hat{A}_\mu$, and the (normally ordered) SU(3)$_c$ currents
$\hat{j}^\mu_a = \bar{\hat{q}}_f \gamma^\mu \frac{\lambda_a}{2} \hat{q}_f$ transform in the same way as the electromagnetic current
$\bar{\hat{q}}\gamma^\mu\hat{q}$, ensuring P and T invariance. Under C, the quark fields transform as
usual according to (7.151). Charge conjugation for the gluon field needs a
little more care. The required rule is


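$$\hat{C}\,\lambda_a \hat{A}_{a\mu}\,\hat{C}^{-1} = -\lambda_a^* \hat{A}_{a\mu}. \tag{14.36}$$

The overall minus sign in (14.36) is analogous to that for the photon field
(cf (7.152)). To understand the complex conjugate on the right-hand side of
(14.36), recall from (7.153) that the complex scalar field $\hat{\phi} = (\hat{\phi}_1 - i\hat{\phi}_2)/\sqrt{2}$
transforms according to

$$\hat{C}(\hat{\phi}_1 - i\hat{\phi}_2)\hat{C}^{-1} = \hat{\phi}_1 + i\hat{\phi}_2. \tag{14.37}$$

Problem 14.2(a) verifies that the (normally ordered) interaction $\hat{j}^\mu_a \hat{A}_{a\mu}$ is then
C-invariant. As regards the term $\hat{F}_{a\mu\nu}\hat{F}_a^{\mu\nu}$, we can write it as

$$\frac{1}{2}\,{\rm Tr}\left(\lambda_a \hat{F}_{a\mu\nu}\,\lambda_b \hat{F}_b^{\mu\nu}\right) \tag{14.38}$$

using the relation

$${\rm Tr}(\lambda_a \lambda_b) = 2\delta_{ab}. \tag{14.39}$$

A short calculation (problem 14.2(b)) shows that $\lambda_a \hat{F}_{a\mu\nu}$ transforms under
C the same way as $\lambda_a \hat{A}_{a\mu}$ (i.e. according to (14.36)). Using the complex
conjugate of (14.39), it then follows that (14.38) is invariant under C.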
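The two steps of this argument can be written out explicitly. The following sketch uses only the trace relation (14.39) and the transformation rule (14.36) applied to $\lambda_a \hat{F}_{a\mu\nu}$; the intermediate lines are a reconstruction of the reasoning rather than a quotation:

```latex
\begin{align*}
\tfrac{1}{2}\,\mathrm{Tr}\!\left(\lambda_a \hat{F}_{a\mu\nu}\,\lambda_b \hat{F}_b^{\mu\nu}\right)
  &= \tfrac{1}{2}\,\mathrm{Tr}(\lambda_a\lambda_b)\,\hat{F}_{a\mu\nu}\hat{F}_b^{\mu\nu}
   = \tfrac{1}{2}\cdot 2\delta_{ab}\,\hat{F}_{a\mu\nu}\hat{F}_b^{\mu\nu}
   = \hat{F}_{a\mu\nu}\hat{F}_a^{\mu\nu}, \\[4pt]
\hat{C}:\quad
\tfrac{1}{2}\,\mathrm{Tr}\!\left(\lambda_a \hat{F}_{a\mu\nu}\,\lambda_b \hat{F}_b^{\mu\nu}\right)
  &\to \tfrac{1}{2}\,\mathrm{Tr}\!\left((-\lambda_a^*)\hat{F}_{a\mu\nu}\,(-\lambda_b^*)\hat{F}_b^{\mu\nu}\right)
   = \tfrac{1}{2}\left[\mathrm{Tr}(\lambda_a\lambda_b)\right]^{*}\hat{F}_{a\mu\nu}\hat{F}_b^{\mu\nu}
   = \hat{F}_{a\mu\nu}\hat{F}_a^{\mu\nu}.
\end{align*}
```

The two minus signs cancel, and the complex conjugate of the real quantity $2\delta_{ab}$ is itself, which is why the gauge field kinetic term is unchanged.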


14.2.4 The $\theta$-term


In arriving at (14.33) we have relied essentially on the ‘gauge principle’
(invariance under a local symmetry) and the requirement of renormalizability (to
forbid the presence of terms with mass dimension higher than 4). The
renormalizability of such a theory was proved by ’t Hooft (1971a, b). However,
there is in fact one more gauge invariant term of mass dimension 4 which can
be written down, namely

$$\hat{\mathcal{L}}_\theta = \frac{\theta g_s^2}{64\pi^2}\,\epsilon_{\mu\nu\rho\sigma}\hat{F}_a^{\mu\nu}\hat{F}_a^{\rho\sigma}; \tag{14.40}$$

this is the ‘$\theta$-term’ of QCD. A full discussion of this term (see for example
Weinberg 1996, section 23.6) is beyond our scope, but we shall give a brief
introduction to the main ideas.

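The reader may wonder, first of all, whether the $\theta$-term should give rise
to a new Feynman rule. The answer to this begins by noting that (14.40) can
actually be written as a total divergence:

$$\epsilon_{\mu\nu\rho\sigma}\hat{F}_a^{\mu\nu}\hat{F}_a^{\rho\sigma} = \partial_\mu \hat{K}^\mu. \tag{14.41}$$

This is more easily seen in the analogous term for QED, namely $\epsilon_{\mu\nu\rho\sigma}\hat{F}^{\mu\nu}\hat{F}^{\rho\sigma}$.
We have

$$\epsilon_{\mu\nu\rho\sigma}\hat{F}^{\mu\nu}\hat{F}^{\rho\sigma} = \epsilon_{\mu\nu\rho\sigma}(\partial^\mu \hat{A}^\nu - \partial^\nu \hat{A}^\mu)(\partial^\rho \hat{A}^\sigma - \partial^\sigma \hat{A}^\rho) \tag{14.42}$$
$$= 4\epsilon_{\mu\nu\rho\sigma}\,\partial^\mu \hat{A}^\nu\,\partial^\rho \hat{A}^\sigma \tag{14.43}$$
$$= \partial^\mu\left(4\epsilon_{\mu\nu\rho\sigma}\,\hat{A}^\nu\,\partial^\rho \hat{A}^\sigma\right), \tag{14.44}$$

where we have used the antisymmetry of the $\epsilon$ symbol in (14.43), and also in
(14.44) since the contraction of $\epsilon$ with the symmetric tensor $\partial^\mu\partial^\rho$ vanishes.
We shall not need the explicit form of $\hat{K}^\mu$.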
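For readers who want the intermediate steps between (14.42) and (14.44) spelled out, a short sketch follows; it uses only the antisymmetry of $\epsilon_{\mu\nu\rho\sigma}$ and the product rule:

```latex
\begin{align*}
% Expanding the product gives four terms; relabelling \mu\leftrightarrow\nu and/or
% \rho\leftrightarrow\sigma and using the antisymmetry of \epsilon makes them all equal:
\epsilon_{\mu\nu\rho\sigma}\,(\partial^\nu \hat{A}^\mu)(\partial^\rho \hat{A}^\sigma)
  &= -\,\epsilon_{\mu\nu\rho\sigma}\,(\partial^\mu \hat{A}^\nu)(\partial^\rho \hat{A}^\sigma)
  \quad\Longrightarrow\quad \text{(14.42)} = 4\,\epsilon_{\mu\nu\rho\sigma}\,\partial^\mu \hat{A}^\nu\,\partial^\rho \hat{A}^\sigma, \\[4pt]
% Product rule for the total derivative:
\partial^\mu\!\left(4\,\epsilon_{\mu\nu\rho\sigma}\,\hat{A}^\nu\,\partial^\rho \hat{A}^\sigma\right)
  &= 4\,\epsilon_{\mu\nu\rho\sigma}\,\partial^\mu \hat{A}^\nu\,\partial^\rho \hat{A}^\sigma
   + 4\,\epsilon_{\mu\nu\rho\sigma}\,\hat{A}^\nu\,\partial^\mu\partial^\rho \hat{A}^\sigma ,
\end{align*}
```

and the last term vanishes because $\epsilon_{\mu\nu\rho\sigma}$ is antisymmetric in $\mu$ and $\rho$ while $\partial^\mu\partial^\rho$ is symmetric.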

Any total divergence in a Lagrangian can be integrated to give only a 
‘surface’ term in the action, which can usually be discarded, making conven- 
tional assumptions about the vanishing of the fields at spatial infinity. There 
are, however, field configurations (‘instantons’) which do contribute to the 
$\theta$-term. Such configurations are not reachable in perturbation theory, and so
no perturbative Feynman rules are associated with (14.40). They approach 
a pure gauge form at spatial infinity, and are therefore associated with the 
QCD vacuum state; their effect is equivalent to including the term (14.40) in 
the QCD Lagrangian (see for example Rajaraman 1982). 

The term (14.40) has potentially important phenomenological implications,
since it conserves C but violates both P and T (and hence also CP).
Again, this is easy to see in the QED analogue term (14.42), which equals
$8\mathbf{E}\cdot\mathbf{B}$ (problem 14.3): we recall that under P, $\mathbf{E} \to -\mathbf{E}$ and $\mathbf{B} \to \mathbf{B}$, while
under T, $\mathbf{E} \to \mathbf{E}$ and $\mathbf{B} \to -\mathbf{B}$. But we know (section 4.2) that strong
interactions conserve both P and T to a high degree of accuracy. In particular,
the neutron electric dipole moment $d_n$, which would violate both P and T, is
extremely small (see (4.133)). A very crude estimate of the size of $d_n$, induced
by the $\theta$-term, is given by dimensional analysis as

$$d_n \sim \theta\,\frac{e}{M_n} \tag{14.45}$$

where $M_n$ is the neutron mass. This would imply $\theta < 10^{-12}$. In fact, this
estimate is too restrictive, since it turns out (Weinberg 1996, section 23.6)
that if any quark has zero mass, $\theta$ can be reduced to zero by a global chiral
U(1) transformation on that quark field. Although neither the u nor the d
quark mass is zero, both are small on a hadronic scale, and a suppression
of (14.45) is expected, increasing the bound on $\theta$. Estimates suggest
$\theta < 10^{-9} - 10^{-10}$.
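The order of magnitude of the crude bound from (14.45) can be reproduced numerically. This is only a sketch: the constants are rounded standard values, and the experimental limit on $d_n$ used below is an illustrative assumption (a few times $10^{-26}$ e cm), not the number quoted in (4.133).

```python
HBARC_MEV_FM = 197.327      # hbar * c in MeV fm (rounded)
NEUTRON_MASS_MEV = 939.565  # neutron mass in MeV (rounded)
FM_TO_CM = 1.0e-13


def edm_per_unit_theta_e_cm():
    """Dimensional estimate e / M_n of (14.45), converted to units of e cm."""
    return HBARC_MEV_FM / NEUTRON_MASS_MEV * FM_TO_CM


def theta_bound(dn_limit_e_cm):
    """Upper bound on theta implied by d_n ~ theta * e / M_n."""
    return dn_limit_e_cm / edm_per_unit_theta_e_cm()


ASSUMED_DN_LIMIT_E_CM = 3.0e-26  # hypothetical experimental limit, for illustration

print(edm_per_unit_theta_e_cm())           # about 2.1e-14 e cm
print(theta_bound(ASSUMED_DN_LIMIT_E_CM))  # about 1.4e-12
```

With these inputs the naive bound comes out at the $10^{-12}$ level, in line with the text; the quark-mass suppression discussed above then relaxes it by two to three orders of magnitude.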

This may seem an unsatisfactorily special value to force on a dimensionless
Lagrangian parameter, when there is nothing in the theory, a priori, to prevent
something of order unity. This perceived difficulty is referred to as the ‘strong
CP problem’. A possible solution to the problem, in which a very small value
of $\theta$ could arise naturally, was suggested by Peccei and Quinn (1977a, 1977b).
Their idea goes beyond the Standard Model, and involves the existence of a
new very light pseudoscalar particle, the axion (Wilczek 1978, Weinberg 1978).

We proceed now with the main topic of this chapter, which is the applica- 
tion of perturbative QCD. 




14.3 Hard scattering processes, QCD tree graphs, and 
jets 


14.3.1 Introduction 


The fundamental distinctive feature of non-Abelian gauge theories is that they 
are ‘asymptotically free’, meaning that the effective coupling strength becomes 
progressively smaller at short distances, or high energies (Gross and Wilczek 
1973, Politzer 1973). This property is the most compelling theoretical motiva- 
tion for choosing a non-Abelian gauge theory for the strong interactions, and 
it enables a quantitative perturbative approach to be followed (in appropriate 
circumstances) even in strong interaction physics. This programme has in- 
deed been phenomenally successful, firmly establishing QCD as the theory of 
strong interactions, and now — in the era of the LHC — serving as a precision 
tool to guide searches for new physics. 

A proper understanding of how this works necessitates a considerable de- 
tour, however, into the physics of renormalization. In particular, we need to 
understand the important cluster of ideas going under the general heading of 
the ‘renormalization group’, and this will be the topic of chapter 15. For the 
moment we proceed with a discussion of some simple tree-level applications 
of QCD, which provided early confrontation of QCD with experiment. 

Let us begin by recapitulating, from a QCD-informed viewpoint, how 
the parton model successfully interpreted deep inelastic and large-$Q^2$ data
in terms of almost free point-like partons — now to be identified with the QCD 
quanta: quarks, antiquarks, and gluons. 
In section 9.5 we briefly introduced the idea of jets in $e^+e^-$ physics: two
well collimated sprays of hadrons, apparently created as a quark-antiquark
pair separate from each other at high speed. The angular distribution of
the two jets followed closely the distribution expected from the parton-level
process $e^+e^- \to \bar{q}q$. The dynamics at the parton level was governed by
QED, but QCD is responsible for the way the emerging $q$ and $\bar{q}$ turn
themselves into hadrons, a process called parton fragmentation (it occurs for
gluons too). We may think of it as proceeding in two stages. First, as the
rapidly moving $q$ and $\bar{q}$ begin to separate, they develop perturbative
showers of narrowly collimated gluons and quark-antiquark pairs. Then, as the
partons separate further, the strength of the forces between them increases,
becoming strongly non-perturbative at a separation of about 1 fm, and
ensuring that the coloured quanta are all confined into hadrons. As yet we
do not have a completely quantitative dynamical understanding of the
second, hadronization, stage: it is implemented by means of a model.
Nevertheless, we can argue that for the forces to be strong enough to produce
the observed hadrons, the dominant processes in hadronization must involve
small momentum transfers, that is, the exchange of ‘soft’ quanta. Thus the
emerging hadrons are also well collimated into two jets, whose energy and
angular distributions reflect the short-distance physics at the parton level.
This simple 2-jet picture will be extended in section 14.4, where we consider
$e^+e^- \to$ 3 jets.

A somewhat different aspect of parton physics arose in sections 9.2-9.3, 
where we considered deep inelastic electron scattering from nucleons. There 
the initial state contained one hadron. Correspondingly, one parton appeared 
in the initial state of the parton-level interaction, and the analysis required 
new functions measuring the probabilities of finding a particular parton in the 
parent hadron — the parton distribution functions. These too are beyond the 
reach of perturbation theory. 


We may also consider, finally, hadron-hadron collisions. In this case, we
need all three of the features we have been discussing: the parton distribution
functions, to provide the initial parton-parton state from the two-hadron
state; the perturbative short-distance parton-parton interaction; and the
parton fragmentation process in the final state. These three parts to the process
are pictured in figure 14.5. The identification and analysis of short distance
parton-parton interactions provide direct tests of the tree-graph structure of
QCD, and perturbative corrections to it.

FIGURE 14.5
Hadron-hadron collision involving parton-parton interaction followed by
parton fragmentation.

This three-part schematization of certain features of hadronic interactions
is useful, because although we cannot yet calculate from first principles
either the parton distribution functions or the fragmentation process, both are
universal. The quark and gluon composition of hadrons is the same for all
processes, and so measurements in one experiment can be used to predict
the results of others. We saw an example of this in the Drell-Yan process of
section 9.4. As regards the fragmentation stage, this too will be universal,
provided one is interested in sufficiently inclusive aspects of the final state. The
three-part scheme is called factorization, and it has been rigorously proved for
some cases. We shall return to factorization in section 15.7.

Let us turn now to some of the early data on parton-parton interactions 
in hadron-hadron collisions. 


14.3.2 Two-jet events in $\bar{p}p$ collisions


How are short-distance parton-parton interactions to be identified
experimentally? The answer is: in just the same way as Rutherford distinguished the
presence of a small heavy scattering centre (the nucleus) in the atom: by
looking at secondary particles emerging at large angles with respect to the beam
direction. For each secondary particle we can define a transverse momentum
$p_T = p\sin\theta$ where $p$ is the particle momentum and $\theta$ is the emission angle
with respect to the beam axis. If hadronic matter were smooth and uniform
(cf the Thomson atom), the distribution of events in $p_T$ would be expected
to fall off very rapidly at large $p_T$ values, perhaps exponentially. This is
just what is observed in the vast majority of events: the average value of $p_T$
measured for charged particles is very low ($\langle p_T \rangle \sim 0.4$ GeV), but in a small
fraction of collisions the emission of high-$p_T$ secondaries is observed. They
were first seen (Büsser et al. 1972, 1973, Alper et al. 1973, Banner et al.
1973) at the CERN ISR (CMS energies 30-62 GeV), and were interpreted
in parton terms as previously indicated. Referring to figure 14.5, a parton
from one hadron undergoes a short-distance ‘hard scattering’ interaction with
a parton from the other, leading in lowest-order perturbation theory to two
wide-angle partons, which then fragment into two jets.

We now face the experimental problem of picking out, from the enormous
multiplicity of total events, just these hard scattering ones, in order to analyse
them further. Early experiments used a trigger based on the detection of a
single high-$p_T$ particle. But it turns out that such triggering really reduces
the probability of observing jets, since the probability that a single hadron in
a jet will actually carry most of the jet’s total transverse momentum is quite
small (Jacob and Landshoff 1978; Collins and Martin 1984, Chapter 5). It is
much better to surround the collision volume with an array of calorimeters
which measure the total energy deposited. Wide-angle jets can then be
identified by the occurrence of a large amount of total transverse energy deposited
in a number of adjacent calorimeter cells: this is then a ‘jet trigger’. The
importance of calorimetric triggers was first emphasized by Bjorken (1973),
following earlier work by Berman, Bjorken and Kogut (1971). The application
of this method to the detection and analysis of wide-angle jets was first
reported by the UA2 collaboration at the CERN $\bar{p}p$ collider (Banner et al.
1982). An impressive body of quite remarkably clean jet data was
subsequently accumulated by both the UA1 and UA2 collaborations (at $\sqrt{s}$ = 546
GeV and 630 GeV), and by the CDF and D0 collaborations at the FNAL
Tevatron collider ($\sqrt{s}$ = 1.8 TeV).
For each event the total transverse energy $\sum E_T$ is measured where

$$\sum E_T = \sum_i E_i \sin\theta_i. \tag{14.46}$$

$E_i$ is the energy deposited in the $i$th calorimeter cell and $\theta_i$ is the polar
angle of the cell centre; the sum extends over all cells. Figure 14.6 shows the
$\sum E_T$ distribution observed by UA2: it follows the ‘soft’ exponential form for
$\sum E_T < 60$ GeV, but thereafter departs from it, showing clear evidence of the
wide-angle collisions characteristic of hard processes.
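The sum in (14.46) is simple to compute. A minimal sketch follows, in which each calorimeter cell is represented by a hypothetical `(energy, theta)` pair; the event below is invented for illustration and is not UA2 data.

```python
import math


def total_transverse_energy(cells):
    """Sum E_i * sin(theta_i) over cells given as (E_i in GeV, theta_i in radians)."""
    return sum(energy * math.sin(theta) for energy, theta in cells)


# Hypothetical event: two energetic central cells plus soft forward activity.
cells = [
    (40.0, math.radians(90.0)),
    (38.0, math.radians(80.0)),
    (20.0, math.radians(10.0)),
]

sum_et = total_transverse_energy(cells)
print(sum_et)          # total transverse energy in GeV
print(sum_et > 60.0)   # would lie beyond the 'soft' exponential region quoted above
```

Note how the forward cell, although it carries 20 GeV, contributes little: the $\sin\theta_i$ weight suppresses energy flowing close to the beam direction.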

As we shall see shortly, the majority of ‘hard’ events are of two-jet type,
with the jets sharing the $\sum E_T$ approximately equally. Thus a ‘local’ trigger
set to select events with localized transverse energy > 30 GeV and/or a ‘global’
trigger set at > 60 GeV can be used. At $\sqrt{s}$ > 500-600 GeV there is plenty
of energy available to produce such events.

FIGURE 14.6
Distribution of the total transverse energy $\sum E_T$ observed in the UA2 central
calorimeter (DiLella 1985).

The total $\sqrt{s}$ value is important for another reason. Consider the
kinematics of the two-parton collision (figure 14.5) in the $\bar{p}p$ CMS. As in the Drell-Yan
process of section 9.4, the right-moving parton has 4-momentum

$$x_1 p_1 = x_1(P, 0, 0, P) \tag{14.47}$$

and the left-moving one

$$x_2 p_2 = x_2(P, 0, 0, -P) \tag{14.48}$$

where $P = \sqrt{s}/2$ and we are neglecting parton transverse momenta, which
are approximately limited by the observed $\langle p_T \rangle$ value (~ 0.4 GeV, and thus
negligible on this energy scale). Consider the simple case of 90° scattering,
which requires (for massless partons) $x_1 = x_2$, equal to $x$ say. The total
outgoing transverse energy is then $2xP = x\sqrt{s}$. If this is to be greater than
50 GeV, then partons with $x > 0.1$ will contribute to the process. The parton
distribution functions are large at these relatively small $x$ values, due to sea
quarks (section 9.3) and gluons (figure 9.9), and thus we expect to obtain a
reasonable cross section.
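The estimate $x \approx 0.1$ follows from one line of arithmetic, sketched below for the two collider energies quoted earlier. The function name is illustrative, and the symmetric 90° configuration $x_1 = x_2 = x$ is assumed throughout.

```python
def min_parton_x(et_threshold_gev, sqrt_s_gev):
    """Smallest x for which the outgoing transverse energy x * sqrt(s) exceeds the threshold."""
    return et_threshold_gev / sqrt_s_gev


for sqrt_s in (546.0, 630.0):
    print(sqrt_s, round(min_parton_x(50.0, sqrt_s), 3))
```

Both values come out slightly below 0.1, so a 50 GeV requirement on the total transverse energy probes partons carrying roughly a tenth of the beam momentum.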

What are the characteristics of jet events? When $\sum E_T$ is large enough
(> 150 GeV), it is found that essentially all of the transverse energy is indeed
split roughly equally between two approximately back-to-back jets. A typical
such event is shown in figure 14.7. Returning to the kinematics of (14.47)
and (14.48), $x_1$ will not in general be equal to $x_2$, so that, as is apparent in
figure 14.7, the jets will not be collinear. However, to the extent that the
transverse parton momenta can be neglected, the jets will be coplanar with
the beam direction, i.e. their relative azimuthal angle will be 180°. Figure
14.8 shows a number of examples in which the distribution of the transverse
energy over the calorimeter cells is analysed as a function of the jet opening
angle $\theta$ and the azimuthal angle $\phi$. It is strikingly evident that we are seeing
precisely a kind of ‘Rutherford’ process, or, to vary the analogy, we might
say that hadronic jets are acting as the modern counterpart of Faraday’s iron
filings, in rendering visible the underlying field dynamics!

FIGURE 14.7
Two-jet event. Two tightly collimated groups of reconstructed charged tracks
can be seen in the cylindrical central detector of UA1, associated with two
large clusters of calorimeter energy depositions. Figure reprinted with
permission from S Geer in High Energy Physics 1985, Proc. Yale Advanced Study
Institute eds M J Bowick and F Gursey; copyright 1986 World Scientific
Publishing Company.

FIGURE 14.8
Four transverse energy distributions for events with $\sum E_T > 100$ GeV, in
the $\theta, \phi$ plane (UA2, DiLella 1985). Each bin represents a cell of the UA2
calorimeter. Note that the sum of the $\phi$’s equals 180° (mod 360°).

We may now consider more detailed features of these two-jet events, in
particular the expectations based on QCD tree graphs. The initial hadrons
provide wide-band beams of quarks, antiquarks and gluons²; thus we shall
have many parton subprocesses, such as $qq \to qq$, $q\bar{q} \to q\bar{q}$, $q\bar{q} \to gg$, $gg \to gg$,
etc. The most important, numerically, for a $\bar{p}p$ collider are $q\bar{q} \to q\bar{q}$, $gq \to gq$
and $gg \to gg$. The cross section will be given, in the parton model, by a
formula of the Drell-Yan type, except that the electromagnetic annihilation
cross section

$$\sigma(q\bar{q} \to \mu^+\mu^-) = 4\pi\alpha^2/3q^2 \tag{14.49}$$

is replaced by the various QCD subprocess cross sections, each one being
weighted by the appropriate distribution functions. At first sight this seems to
be a very complicated story, with so many contributing parton processes. But
a significant simplification comes from the fact that in the CMS of the parton
collision, all processes involving one gluon exchange will lead to essentially the
same dominant angular distribution of Rutherford-type, $\sim \sin^{-4}\theta/2$, where $\theta$
is the parton CMS scattering angle (recall section 1.3.6). This is illustrated
in table 14.1 (taken from Combridge et al. 1977), which lists the different
relevant spin averaged, squared, one-gluon-exchange matrix elements $|\mathcal{M}|^2$,
where the parton differential cross section is given by (cf (6.129))

$$\frac{d\sigma}{d\cos\theta} = \frac{\pi\alpha_s^2}{2\hat{s}}\,|\mathcal{M}|^2. \tag{14.50}$$

Here $\alpha_s = g_s^2/4\pi$, and $\hat{s}$, $\hat{t}$ and $\hat{u}$ are the subprocess invariants, so that

$$\hat{s} = (x_1 p_1 + x_2 p_2)^2 = x_1 x_2 s \quad (\text{cf } (9.84)). \tag{14.51}$$

TABLE 14.1
Spin-averaged squared matrix elements for one-gluon exchange ($\hat{t}$-channel)
processes.

Subprocess                                  $|\mathcal{M}|^2$
$qq \to qq$, $q\bar{q} \to q\bar{q}$        $\frac{4}{9}\,\frac{\hat{s}^2 + \hat{u}^2}{\hat{t}^2}$
$qg \to qg$                                 $\frac{\hat{s}^2 + \hat{u}^2}{\hat{t}^2} + \cdots$
$gg \to gg$                                 $\frac{9}{4}\,\frac{\hat{s}^2 + \hat{u}^2}{\hat{t}^2} + \cdots$

² In the sense that the partons in hadrons have momentum or energy distributions, which
are characteristic of their localization to hadronic dimensions.


FIGURE 14.9
Two-jet angular distribution plotted against $\cos\theta$ (Arnison et al. 1985).

Continuing to neglect the parton transverse momenta, the initial parton
configuration shown in figure 14.5 can be brought to the parton CMS by a Lorentz
transformation along the beam direction, the outgoing partons then emerging
back-to-back at an angle $\theta$ to the beam axis, so $\hat{t} \propto (1 - \cos\theta) \propto \sin^2\theta/2$. Only
the terms in $(\hat{t})^{-2} \sim \sin^{-4}\theta/2$ are given in table 14.1. We note that the $\hat{s}$, $\hat{t}$, $\hat{u}$
dependence of these terms is the same for the three types of process (and is in
fact the same as that found for the $1\gamma$ exchange process $e^-\mu^- \to e^-\mu^-$: see
problem 8.17, converting $d\sigma/dt$ into $d\sigma/d\cos\theta$). Figure 14.9 shows the two
jet angular distribution measured by UA1 (Arnison et al. 1985). The broken
curve is the exact angular distribution predicted by all the QCD tree graphs;
it actually follows the $\sin^{-4}\theta/2$ shape quite closely.
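How closely the leading $\hat{t}$-channel term tracks the pure Rutherford shape can be seen numerically. The sketch below assumes massless partons in the parton CMS, so that $\hat{t} = -\hat{s}(1-\cos\theta)/2$ and $\hat{u} = -\hat{s}(1+\cos\theta)/2$, and takes the common kinematic factor to be $(\hat{s}^2+\hat{u}^2)/\hat{t}^2$, the form familiar from one-photon exchange; colour factors are omitted, and the function names are illustrative.

```python
def mandelstam(cos_theta, s_hat=1.0):
    """Massless 2 -> 2 invariants in the parton CMS; they satisfy s + t + u = 0."""
    t_hat = -0.5 * s_hat * (1.0 - cos_theta)
    u_hat = -0.5 * s_hat * (1.0 + cos_theta)
    return s_hat, t_hat, u_hat


def leading_term(cos_theta):
    """Common t-channel factor (s^2 + u^2) / t^2, without colour factors."""
    s, t, u = mandelstam(cos_theta)
    return (s * s + u * u) / (t * t)


def rutherford_shape(cos_theta):
    """sin^-4(theta/2), using sin^2(theta/2) = (1 - cos theta) / 2."""
    return 4.0 / (1.0 - cos_theta) ** 2


for c in (0.0, 0.4, 0.8):
    print(c, leading_term(c) / rutherford_shape(c))
```

The ratio equals $1 + (1+\cos\theta)^2/4$, which only varies between 1 and 2 over the whole angular range, whereas $\sin^{-4}\theta/2$ itself changes by orders of magnitude. This is the sense in which all one-gluon-exchange subprocesses share essentially the same dominant angular distribution.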

It is interesting to compare this angular distribution with the one predicted
on the assumption that the exchanged gluon is a spinless particle, so that the
vertices have the form ‘$\bar{u}u$’ rather than ‘$\bar{u}\gamma_\mu u$’. Problem 14.4 shows that in
this case the $1/\hat{t}^2$ factor in the cross section is completely cancelled, thus ruling
out such a model.

This analysis provides compelling evidence for elementary hard scattering
events proceeding via the exchange of a massless vector quantum. It is
possible to go much further. Anticipating our later discussion, the small
discrepancy between ‘tree graph’ theory (which is labelled ‘leading order QCD
scaling curve’ in figure 14.9) and experiment can be accounted for by including
corrections which are of higher order in $\alpha_s$. The solid curve in figure 14.9
includes QCD corrections beyond the tree level, involving the ‘running’ of the
coupling constant $\alpha_s$ and ‘scaling violation’ in the effective parton distribution
functions, both of which effects will be discussed in the following chapter.
The corrections lead to good agreement with experiment.

FIGURE 14.10
Effective distribution function measured from two-jet events (Arnison et al.
1984 and Bagnaia et al. 1984). The broken and chain curves are obtained
from deep inelastic neutrino scattering. Taken from DiLella (1985).

FIGURE 14.11
The gluon distribution function $g(x)$ extracted from the effective distribution
function $F(x)$ by subtracting the expected contribution from the quarks
and antiquarks. Figure reprinted with permission from S Geer in High Energy
Physics 1985, Proc. Yale Theoretical Advanced Study Institute, eds M J
Bowick and F Gursey; copyright 1986 World Scientific Publishing Company.

The fact that the angular distributions of all the subprocesses are so similar
allows further information to be extracted from these two-jet data. In general,
the parton model cross section will have the form (cf (9.91))

$$\frac{d^3\sigma}{dx_1\,dx_2\,d\cos\theta} = \sum_{a,b} \frac{F_a(x_1)}{x_1}\,\frac{F_b(x_2)}{x_2} \sum_{c,d} \frac{d\sigma_{ab\to cd}}{d\cos\theta} \tag{14.52}$$

where $F_a(x_1)/x_1$ is the distribution function for partons of type ‘a’ ($q$, $\bar{q}$ or $g$),
and similarly for $F_b(x_2)/x_2$. Using the near identity of all $d\sigma/d\cos\theta$’s, and
noting the numerical factors in table 14.1, the sums over parton types reduce
to

$$\frac{9}{4}\left\{g(x_1) + \frac{4}{9}\left[q(x_1) + \bar{q}(x_1)\right]\right\}\left\{g(x_2) + \frac{4}{9}\left[q(x_2) + \bar{q}(x_2)\right]\right\} \tag{14.53}$$

where $g(x)$, $q(x)$ and $\bar{q}(x)$ are the gluon, quark and antiquark distribution
functions. Thus effectively the weighted distribution function³

$$\frac{F(x)}{x} = g(x) + \frac{4}{9}\left[q(x) + \bar{q}(x)\right] \tag{14.54}$$

is measured (Combridge and Maxwell, 1984); in fact, with the weights as in
(14.53),

$$\frac{d^3\sigma}{dx_1\,dx_2\,d\cos\theta} = \frac{F(x_1)}{x_1}\,\frac{F(x_2)}{x_2}\,\frac{d\sigma_{gg\to gg}}{d\cos\theta}. \tag{14.55}$$

$x_1$ and $x_2$ are kinematically determined from the measured jet variables: from
(14.51),

$$x_1 x_2 = \hat{s}/s \tag{14.56}$$

where $\hat{s}$ is the invariant [mass]$^2$ of the two-jet system and

$$x_1 - x_2 = 2P_L/\sqrt{s} \quad (\text{cf } (9.82)) \tag{14.57}$$

with $P_L$ the total two-jet longitudinal momentum. Figure 14.10 shows $F(x)/x$
obtained in the UA1 (Arnison et al. 1984) and UA2 (Bagnaia et al. 1984)
experiments. Also shown in this figure is the expected $F(x)/x$ based on
contemporary fits to the deep inelastic neutrino scattering data at $Q^2 = 20$ GeV$^2$
and 2000 GeV$^2$ (Abramovicz et al. 1982a,b, 1983); the reason for the change
with $Q^2$ will be discussed in section 15.6. The agreement is qualitatively very
satisfactory. Subtracting the distributions for quarks and antiquarks as found
in deep inelastic lepton scattering, UA1 were able to deduce the gluon
structure function $g(x)$ shown in figure 14.11. It is clear that gluon processes will
dominate at small $x$, and even at larger $x$ will be important because of the
colour factors in table 14.1.

³ The 4/9 reflects the relative strengths of the quark-gluon and gluon-gluon couplings in
QCD; see problem 14.5.
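The determination of $x_1$ and $x_2$ from the measured jet variables amounts to solving (14.56) and (14.57) simultaneously, which gives a quadratic equation for $x_1$. A minimal sketch, with illustrative function and argument names and an invented event:

```python
import math


def parton_fractions(m_jj_gev, p_long_gev, sqrt_s_gev):
    """Solve x1 * x2 = s_hat / s and x1 - x2 = 2 * P_L / sqrt(s) for (x1, x2).

    m_jj_gev is the two-jet invariant mass (so s_hat = m_jj^2) and
    p_long_gev is the total two-jet longitudinal momentum P_L.
    """
    tau = (m_jj_gev / sqrt_s_gev) ** 2
    diff = 2.0 * p_long_gev / sqrt_s_gev
    # x1 is the positive root of x^2 - diff * x - tau = 0.
    x1 = 0.5 * (diff + math.sqrt(diff * diff + 4.0 * tau))
    x2 = x1 - diff
    return x1, x2


# Hypothetical two-jet event at sqrt(s) = 630 GeV.
x1, x2 = parton_fractions(m_jj_gev=100.0, p_long_gev=50.0, sqrt_s_gev=630.0)
print(x1, x2)
```

For a two-jet system with zero net longitudinal momentum the two fractions coincide, $x_1 = x_2 = \sqrt{\hat{s}/s}$, which is the symmetric configuration considered earlier.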


14.3.3 Three-jet events in $\bar{p}p$ collisions


Although most of the high-$\sum E_T$ events at hadron colliders are two-jet events,
in some 10-30% of the cases the energy is shared between three jets. An
example is included as (d) in the collection of figure 14.8; a clearer one is
shown in figure 14.12. In QCD such events are interpreted as arising from
a 2 parton $\to$ 2 parton + 1 gluon process of the type $gg \to ggg$, $gq \to ggq$,
etc. Once again, one can calculate (Kunszt and Pietarinen 1980, Gottschalk
and Sivers 1980, Berends et al. 1981) all possible contributing tree graphs,
of the kind shown in figure 14.13, which should dominate at small $\alpha_s$. They
are collectively known as QCD single-bremsstrahlung diagrams. Analysis of
triple jets which are well separated both from each other and from the beam
directions shows that the data are in good agreement with these lowest-order
QCD predictions. For example, figure 14.14 shows the production angular
distribution of UA2 (Appel et al. 1986) as a function of $\cos\theta^*$, where $\theta^*$ is
the angle between the leading (most energetic) jet momentum and the beam
axis, in the three-jet CMS. It follows just the same $\sin^{-4}\theta^*/2$ curve as in the
two-jet case (the data for which are also shown in the figure), as expected
for massless quantum exchange; the particular curve is for the representative
process $gg \to ggg$.

FIGURE 14.12
Three-jet event in the UA1 detector, and the associated transverse energy flow
plot. Figure reprinted with permission from S Geer in High Energy Physics
1985, Proc. Yale Theoretical Advanced Study Institute, eds M J Bowick and
F Gursey; copyright 1986 World Scientific Publishing Company.

FIGURE 14.13
Some tree graphs associated with three-jet events.

FIGURE 14.14
The distribution of $\cos\theta^*$ (•), where $\theta^*$ is the angle of the leading jet with respect to
the beam line (normalized to unity at $\cos\theta^* = 0$), for three-jet events in
$\bar{p}p$ collisions (Appel et al. 1986). The distribution for two-jet events is also
shown (○). The full curve is a parton model calculation using the tree graph
amplitudes for $gg \to ggg$, and cut-offs in transverse momentum and angular
separation to eliminate divergences (see remarks following equation (14.73)).

Another qualitative feature is that the ratio of three-jet to two-jet events
is controlled, roughly, by $\alpha_s$ (compare figure 14.13 with the graphs in table
14.1). Thus an estimate of $\alpha_s$ can be obtained by comparing the rates of
3-jet to 2-jet events in $\bar{p}p$ collisions. Other interesting predictions concern
the characteristics of the 3-jet final state (for example, the distributions in
the jet energy variables). At this point, however, it is convenient to leave $\bar{p}p$
collisions and consider instead 3-jet events in $e^+e^-$ collisions, for which the
complications associated with the initial state hadrons are absent.


14.4 3-jet events in $e^+e^-$ annihilation


FIGURE 14.15
Gluon bremsstrahlung corrections to the two-jet parton-level process.


Three-jet events in $e^+e^-$ collisions originate, according to QCD, from gluon
bremsstrahlung corrections to the two-jet parton-level process
$e^+e^- \to \gamma^* \to q\bar{q}$, as shown in figure 14.15.$^4$ This phenomenon was predicted by Ellis et al.
(1976) and subsequently observed by Brandelik et al. (1979) with the TASSO
detector at PETRA, and Barber et al. (1979) with MARK-J at PETRA,
thus providing early encouragement for QCD. The situation here is in many
ways simpler and cleaner than in the $\bar{p}p$ case: the initial state ‘partons’ are
perfectly physical QED quanta, and their total 3-momentum is zero in the CMS, so that
the three jets have to be coplanar; further, there is only one type of diagram
compared to the large number in the $\bar{p}p$ case, and much of that diagram
involves the easier vertices of QED. Since the calculation of the cross section
predicted from figure 14.15 is relevant not only to three-jet production in $e^+e^-$
collisions, but also to a satisfactory definition of the two-jet production cross
section, to QCD corrections to the total $e^+e^-$ annihilation cross section, and
to scaling violations in deep inelastic scattering as well, we shall now consider
it in some detail. It is important to emphasize at the outset that quark masses
will be neglected in this calculation.


14.4.1 Calculation of the parton-level cross section 


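The quark, antiquark and gluon 4-momenta are $p_1$, $p_2$ and $p_3$ respectively, as
shown in figure 14.15; the $e^-$ and $e^+$ 4-momenta are $k_1$ and $k_2$. The cross
section is then (cf (6.110) and (6.112))

$$\mathrm{d}\sigma = \frac{1}{(2\pi)^5}\,\delta^4(k_1 + k_2 - p_1 - p_2 - p_3)\,\frac{|\mathcal{M}_{q\bar{q}g}|^2}{2Q^2}\,\frac{\mathrm{d}^3p_1}{2E_1}\,\frac{\mathrm{d}^3p_2}{2E_2}\,\frac{\mathrm{d}^3p_3}{2E_3} \tag{14.58}$$

where (neglecting all masses)

$$\mathcal{M}_{q\bar{q}g} = \frac{e_a e^2 g_s}{Q^2}\,\bar{v}(k_2)\gamma^\mu u(k_1)\left(\bar{u}(p_1)\gamma_\nu\frac{\lambda_c}{2}\,\frac{(\slashed{p}_1 + \slashed{p}_3)}{2p_1\cdot p_3}\,\gamma_\mu v(p_2) - \bar{u}(p_1)\gamma_\mu\,\frac{(\slashed{p}_2 + \slashed{p}_3)}{2p_2\cdot p_3}\,\gamma_\nu\frac{\lambda_c}{2}\,v(p_2)\right)\epsilon^{*\nu}(\lambda)\,a_c \tag{14.59}$$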


and $Q^2 = 4E^2$ is the square of the total $e^+e^-$ energy, and also the square of
the virtual photon's 4-momentum $Q$, and $e_a$ (in units of $e$) is the charge of a
quark of type ‘a’. Note the minus sign in (14.59): the antiquark coupling is
$-g_s$. In (14.59), $\epsilon^{*\nu}(\lambda)$ is the polarization vector of the outgoing gluon with
polarization $\lambda$; $a_c$ is the colour wavefunction of the gluon ($c = 1, \ldots, 8$),
and $\lambda_c$ is the corresponding Gell-Mann matrix introduced in section 12.2; the
colour parts of the $q$ and $\bar{q}$ wavefunctions are understood to be included in
the $u$ and $v$ factors; and $(\slashed{p}_1 + \slashed{p}_3)/2p_1\cdot p_3$ is the virtual quark propagator
(cf (L.6) in appendix L of volume 1) before gluon radiation, and similarly for
the antiquark. Since the colour parts separate from the Dirac trace parts, we
shall ignore them to begin with, and reinstate the result of the colour sum
(via problem (14.5)) in the final answer (14.73).

$^4$ This is assuming that the total $e^+e^-$ energy is far from the $Z^0$ mass; if not, the
contribution from the intermediate $Z^0$ must be added to that from the photon.

Averaging over $e^\pm$ spins and summing over final state quark spins and
gluon polarization $\lambda$ (using (8.171), and noting the discussion after (13.93)),
we obtain (problem 14.6)

$$\frac{1}{4}\sum_{\mathrm{spins},\,\lambda}|\mathcal{M}_{q\bar{q}g}|^2 = \frac{e^4 e_a^2 g_s^2}{8Q^4}\,L^{\mu\nu}(k_1, k_2)\,H_{\mu\nu}(p_1, p_2, p_3) \tag{14.60}$$


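where the lepton tensor is, as usual (equation (8.119)),

$$L^{\mu\nu}(k_1, k_2) = 2(k_1^\mu k_2^\nu + k_1^\nu k_2^\mu - k_1\cdot k_2\,g^{\mu\nu}) \tag{14.61}$$

and the hadron tensor is

$$\begin{aligned}
H_{\mu\nu}(p_1, p_2, p_3) = {} & \frac{1}{p_1\cdot p_3}\left[L_{\mu\nu}(p_2, p_3) - L_{\mu\nu}(p_1, p_1) + L_{\mu\nu}(p_1, p_2)\right] \\
& + \frac{1}{p_2\cdot p_3}\left[L_{\mu\nu}(p_1, p_3) - L_{\mu\nu}(p_2, p_2) + L_{\mu\nu}(p_1, p_2)\right] \\
& + \frac{p_1\cdot p_2}{(p_1\cdot p_3)(p_2\cdot p_3)}\left[2L_{\mu\nu}(p_1, p_2) + L_{\mu\nu}(p_1, p_3) + L_{\mu\nu}(p_2, p_3)\right].
\end{aligned} \tag{14.62}$$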


Combining (14.61) and (14.62) allows complete expressions for the five-fold 
differential cross section to be obtained (Ellis et al. 1976). 

For the subsequent discussion it will be useful to integrate over the three
angles describing the orientation (relative to the beam axis) of the production
plane containing the three partons. After this integration, the (doubly
differential) cross section is a function of two independent Lorentz invariant
variables, which are conveniently taken to be two of the three $s_{ij}$ defined by

$$s_{ij} = (p_i + p_j)^2. \tag{14.63}$$

Since we are considering the massless case $p_i^2 = 0$ throughout, we may also
write

$$s_{ij} = 2p_i\cdot p_j. \tag{14.64}$$

These variables are linearly related by

$$2(p_1\cdot p_2 + p_2\cdot p_3 + p_3\cdot p_1) = Q^2 \tag{14.65}$$
FIGURE 14.16
Virtual photon decaying to $q\bar{q}g$.


as follows from

$$(p_1 + p_2 + p_3)^2 = Q^2 \tag{14.66}$$

and $p_i^2 = 0$. The integration yields (Ellis et al. 1976, 1977)

$$\frac{\mathrm{d}^2\sigma}{\mathrm{d}s_{13}\,\mathrm{d}s_{23}} = \frac{2}{3}\,\alpha^2 e_a^2\,\alpha_s\,\frac{1}{(Q^2)^3}\left(\frac{s_{13}}{s_{23}} + \frac{s_{23}}{s_{13}} + \frac{2Q^2 s_{12}}{s_{13}s_{23}}\right) \tag{14.67}$$

where $\alpha_s = g_s^2/4\pi$.

We may understand the form of this result in a simple way, as follows. It
seems plausible that after integrating over the production angles, the lepton
tensor will be proportional to $Q^2 g^{\mu\nu}$, all directional knowledge of the $k_i$ having
been lost. Indeed, if we use $-g^{\mu\nu}L_{\mu\nu}(p, p') = 4p\cdot p'$ together with (14.62) we
easily find that

$$-\frac{1}{4}g^{\mu\nu}H_{\mu\nu} = \frac{p_1\cdot p_3}{p_2\cdot p_3} + \frac{p_2\cdot p_3}{p_1\cdot p_3} + \frac{p_1\cdot p_2\,Q^2}{(p_1\cdot p_3)(p_2\cdot p_3)} = \frac{s_{13}}{s_{23}} + \frac{s_{23}}{s_{13}} + \frac{2Q^2 s_{12}}{s_{13}s_{23}}, \tag{14.68}$$

exactly the factor appearing in (14.67). In turn, the result may be given
a simple physical interpretation. From (7.118) we note that we can replace
$-g^{\mu\nu}$ by $\sum_{\lambda'}\epsilon^\mu(\lambda')\epsilon^{\nu*}(\lambda')$ for a virtual photon of polarization $\lambda'$, the $\lambda' = 0$
state contributing negatively. Thus effectively the result of doing the angular
integration is (up to constants and $Q^2$ factors) to replace the lepton factor
$\bar{v}(k_2)\gamma^\mu u(k_1)$ by $-\mathrm{i}\epsilon^\mu(\lambda')$, so that $\mathcal{M}_{q\bar{q}g}$ is proportional to the $\gamma^* \to q\bar{q}g$
processes shown in figure 14.16. But these are basically the same amplitudes
as the ones we already met in Compton scattering (section 8.6). To compare
with section 8.6.3, we convert the initial state fermion (electron/quark) into
a final state antifermion (positron/antiquark) by $p \to -p$, and then identify
the variables of figure 14.16 with those of figure 8.14(a) by

$$p' \to p_1, \quad k' \to p_3, \quad -p \to p_2, \quad s \to 2p_1\cdot p_3 = s_{13}, \quad t \to 2p_1\cdot p_2 = s_{12}, \quad u \to 2p_2\cdot p_3 = s_{23}. \tag{14.69}$$

Remembering that in (8.181) the virtual $\gamma$ had squared 4-momentum $-Q^2$,
we see that the Compton ‘$\sum|\mathcal{M}|^2$’ of (8.181) indeed becomes proportional
to the factor (14.68), as expected.
FIGURE 14.17
The kinematically allowed region in ($x_i$) is the interior of the equilateral triangle.


14.4.2 Soft and collinear divergences 


In three-body final states of the type under discussion here it is often convenient
to preserve the symmetry between the $s_{ij}$'s and use three (dimensionless)
variables $x_i$ defined by

$$s_{23} = Q^2(1 - x_1) \quad \text{and cyclic permutations.} \tag{14.70}$$

These are related by (14.65), which becomes

$$x_1 + x_2 + x_3 = 2. \tag{14.71}$$

An event with a given value of the set $x_i$ can then be plotted as a point in an
equilateral triangle of height 1, as shown in figure 14.17. In order to find the
limits of the allowed physical region in this $x_i$ space, we now transform from
the overall three-body CMS to the CMS of 2 and 3 (figure 14.18). If $\tilde{\theta}$ is the
angle between 1 and 3 in this system, then (problem 14.7)

$$\begin{aligned}
x_2 &= (1 - x_1/2) + (x_1/2)\cos\tilde{\theta} \\
x_3 &= (1 - x_1/2) - (x_1/2)\cos\tilde{\theta}.
\end{aligned} \tag{14.72}$$

The limits of the physical region are then clearly $\cos\tilde{\theta} = \pm 1$, which correspond
to $x_2 = 1$ and $x_3 = 1$. By symmetry, we see that the entire perimeter of the
triangle in figure 14.17 is the required boundary: physical events fall anywhere
inside the triangle. (This is the massless limit of the classic Dalitz plot, first
introduced by Dalitz (1953) for the analysis of $K \to 3\pi$.) Lines of constant $\tilde{\theta}$
are shown in figure 14.17.
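
The variables of (14.70) are easy to evaluate for a concrete event. The following Python sketch is an illustration added here, not part of the original calculation: it computes the $x_i$ from three massless 4-momenta and checks the constraint (14.71). The function names and the helper `massless` are hypothetical choices made for this example.

```python
import math

def minkowski_dot(p, q):
    """Minkowski product with metric (+, -, -, -); momenta are (E, px, py, pz)."""
    return p[0] * q[0] - p[1] * q[1] - p[2] * q[2] - p[3] * q[3]

def energy_fractions(p1, p2, p3):
    """Return (x1, x2, x3) from s_23 = Q^2 (1 - x1) and cyclic permutations."""
    total = [a + b + c for a, b, c in zip(p1, p2, p3)]
    q2 = minkowski_dot(total, total)
    s23 = 2.0 * minkowski_dot(p2, p3)  # massless partons: s_ij = 2 p_i . p_j
    s13 = 2.0 * minkowski_dot(p1, p3)
    s12 = 2.0 * minkowski_dot(p1, p2)
    return (1.0 - s23 / q2, 1.0 - s13 / q2, 1.0 - s12 / q2)

def massless(energy, phi):
    """Hypothetical helper: massless 4-momentum in the x-y plane at azimuth phi."""
    return (energy, energy * math.cos(phi), energy * math.sin(phi), 0.0)

# Symmetric three-jet event: equal energies, 120 degrees apart.
symmetric_event = [massless(1.0, 2.0 * math.pi * k / 3.0) for k in range(3)]
x1, x2, x3 = energy_fractions(*symmetric_event)
print(x1, x2, x3, x1 + x2 + x3)

# Collinear event: gluon (particle 3) parallel to the quark (particle 1).
collinear = energy_fractions(massless(0.6, 0.0), massless(1.0, math.pi), massless(0.4, 0.0))
print(collinear)
```

In the symmetric event each $x_i$ equals 2/3, the centre of the triangle, while the collinear event lies on the boundary $x_2 = 1$.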
FIGURE 14.18
Definition of $\tilde{\theta}$.


Now consider the distribution provided by the QCD bremsstrahlung process,
equation (14.67), which can be written equivalently as

$$\frac{\mathrm{d}^2\sigma}{\mathrm{d}x_1\,\mathrm{d}x_2} = \sigma_{\mathrm{pt}}\,\frac{2\alpha_s}{3\pi}\left(\frac{x_1^2 + x_2^2}{(1 - x_1)(1 - x_2)}\right) \tag{14.73}$$

where $\sigma_{\mathrm{pt}}$ is the pointlike $e^+e^- \to$ hadrons total cross section of (9.99), and
a factor of 4 has been introduced from the colour sum (problem 14.5). The
factor in large parentheses is (14.68) written in terms of the $x_i$ (problem 14.8).
The most striking feature of (14.73) is that it is infinite as $x_1$ or $x_2$, or both,
tend to 1, and in such a way that the cross section integrated over $x_1$ and
$x_2$ diverges logarithmically.

This is a quite different infinity from the ones encountered in the loop
integrals of chapters 10 and 11. No integral over an arbitrarily large internal
momentum is involved here: the tree amplitude itself is becoming singular on
the phase space boundary. We can trace the origin of the singularity back to
the denominator factors $(p_1\cdot p_3)^{-1} \sim (1 - x_2)^{-1}$ and $(p_2\cdot p_3)^{-1} \sim (1 - x_1)^{-1}$
in (14.59). These denominators vanish in two distinct configurations of the gluon
momentum:

$$\text{(i)}\quad p_3 \propto p_1 \ \text{ or } \ p_3 \propto p_2 \quad (\text{using } p_i^2 = 0) \tag{14.74}$$

$$\text{(ii)}\quad p_3 \to 0 \tag{14.75}$$

which are easily interpreted physically. Condition (i) corresponds to a situation
in which the 4-momentum of the gluon is parallel to that of either
the quark or the antiquark; this is called a ‘collinear divergence’ and the
configuration is pictured in figure 14.19(a). If we restore the quark masses,
$p_1^2 = m_1^2 \neq 0$ and $p_2^2 = m_2^2 \neq 0$, then the factor $(2p_1\cdot p_3)^{-1}$, for example, becomes
$((p_1 + p_3)^2 - m_1^2)^{-1}$, whose denominator vanishes only as $p_3 \to 0$, which is condition
(ii). The divergence of type (i) is therefore also termed a ‘mass singularity’,
as it would be absent if the quarks had mass. Condition (ii) corresponds to
the emission of a very ‘soft’ gluon (figure 14.19(b)) and is called a ‘soft, or
infrared, divergence’. In contrast to this, the gluon momentum $p_3$ in type (i)
does not have to be vanishingly small.
FIGURE 14.19

Gluon configurations leading to divergences of equation (14.73): (a) gluon
emitted approximately collinear with quark (or antiquark); (b) soft gluon
emission. The events are viewed in the overall CMS.


It is apparent from these figures that in either of these two cases the
observed final state hadrons, after the fragmentation process, will in fact resemble
a two-jet configuration. Such events will be found in the regions $x_1 \approx 1$
and/or $x_2 \approx 1$ of the kinematical plot shown in figure 14.17, which correspond
to strips adjacent to two of the boundaries of the triangle. Events outside these
strips should be essentially three-jet events, corresponding to the emission of
a hard, non-collinear gluon. To isolate such events, we must keep away from
the boundaries of the triangle (the strip along the third boundary $x_3 = 1$ will
not contain a divergence, but will be included in a physical jet measure; see
the following section). Thus to order $\alpha^2\alpha_s$ the total annihilation cross section
to three jets is given by the integral of (14.73) over a suitably defined inner
triangular region in figure 14.17.
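
As a rough numerical illustration of such an integral, the sketch below integrates the factor in (14.73) with a simple midpoint rule over the region in which every $x_i$ is smaller than $1 - \texttt{cut}$. The parameter `cut`, the grid size `n` and the function name are assumptions made for this example; the text leaves the precise definition of the inner region to the following section.

```python
import math

def three_jet_fraction(alpha_s, cut, n=400):
    """Integral of (14.73) / sigma_pt over the region x_i < 1 - cut, i = 1, 2, 3."""
    lo, hi = 2.0 * cut, 1.0 - cut  # x1 > 2*cut follows from x2, x3 < 1 - cut
    if hi <= lo:
        return 0.0                 # the inner region is empty for cut >= 1/3
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x1 = lo + (i + 0.5) * h
        for j in range(n):
            x2 = lo + (j + 0.5) * h
            x3 = 2.0 - x1 - x2     # constraint (14.71)
            if x3 < 1.0 - cut:     # stay away from the third boundary as well
                total += (x1 * x1 + x2 * x2) / ((1.0 - x1) * (1.0 - x2))
    return 2.0 * alpha_s / (3.0 * math.pi) * total * h * h

for cut in (0.2, 0.05, 0.01):
    print(cut, three_jet_fraction(0.12, cut, n=200))
```

The result grows as `cut` is reduced, reflecting the logarithmic divergence of (14.73) at the boundaries, and it scales linearly with the assumed value of $\alpha_s$.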

Assuming such a separation of three- and two-jet events can be done satisfactorily
(see the next section), their ratio carries important information:
namely, it should be proportional to $\alpha_s$. This follows simply from the extra
factor of $g_s$ associated with the gluon emissions in figure 14.15. Glossing over
a number of technicalities (for which the reader is referred to Ellis, Stirling
and Webber 1996, section 3.3), we show in figure 14.20 a compilation of data
on the fraction of three-jet events at different $e^+e^-$ annihilation energies. The
most remarkable feature of this figure is, of course, that this fraction, and
hence $\alpha_s$, changes with energy, decreasing as the energy increases. This is, in
fact, direct evidence for asymptotic freedom. A more recent comparison between
theory and experiment (the agreement is remarkable) will be presented
in the following chapter, section 15.3, after we have introduced the theoretical
framework for calculating the energy dependence of $\alpha_s$.


14.5 Definition of the two-jet cross section in $e^+e^-$ annihilation


As just noted, the integral of (14.73) over the remaining regions of figure 14.17,
near the phase-space boundaries, will contribute to the two-jet annihilation
cross section, and it is divergent. Clearly this is not a physically acceptable
result: we want a finite two-jet cross section. The cure lies in recognizing
that at the order to which we are working, namely $\alpha^2\alpha_s$, other parton-level
graphs can contribute. These are the one-gluon loop graphs shown in figure
14.21, which are of order $\alpha\alpha_s$. They turn out to contain exactly the same soft
and collinear divergences, this time associated with configurations of virtual
momenta inside the loops. In a carefully defined two-jet cross section, these
two classes of divergences (one from real gluon emission, the other from virtual
gluons) actually cancel.

FIGURE 14.20
Energy dependence of three-jet production: a compilation of three-jet fractions
at different $e^+e^-$ annihilation energies.
Adapted from Akrawy et al. (OPAL) (1990); figure from R K Ellis, W J Stirling
and B R Webber (1996) QCD and Collider Physics, courtesy Cambridge
University Press.

Let us call the amplitude for the sum of these three graphs $F_{vg}$, where ‘vg’
stands for virtual gluon. $F_{vg}$ is the order $\alpha_s$ correction to the original order
$\alpha$ parton-level graph of figure 9.17, shown here again in figure 14.22, with
amplitude $F_\gamma$. The cross section from these contributions is proportional to
$|F_\gamma + F_{vg}|^2$. There are three terms in this expression: one of order $\alpha^2$, from
$|F_\gamma|^2$; another of order $\alpha^2\alpha_s^2$, from $|F_{vg}|^2$, which we drop since it is of higher
order in $\alpha_s$; and an interference term of order $\alpha^2\alpha_s$, the same as (14.73).
Thus the interference term must be included in calculating the two-jet cross
section to this order. When it is, the soft and collinear divergences cancel:
the resulting two-jet cross section is IRC (infrared and collinear) ‘safe’.


$^5$ The usual ultraviolet divergences in the loop graphs are removed by conventional
renormalization.

FIGURE 14.21
Virtual gluon corrections to figure 14.22.

FIGURE 14.22
One-photon annihilation amplitude in $e^+e^- \to q\bar{q}$.


This result was first shown by Sterman and Weinberg (1977), in a paper
which initiated the modern treatment of jets within the framework of QCD.
They defined the two-jet differential cross section to include those events in
which all but a fraction $\epsilon$ of the total $e^+e^-$ energy $E$ ($= \sqrt{Q^2}$) is emitted
within some pair of oppositely directed cones of half-angle $\delta \ll 1$, lying at an
angle $\theta$ to the $e^+e^-$ beam line. Including the contributions of real and virtual
gluons up to order $\alpha^2\alpha_s$, the result is (Muta 2010, section 5.4.1)

$$\left(\frac{\mathrm{d}\sigma}{\mathrm{d}\Omega}\right)_{2\text{-jet}} = \left(\frac{\mathrm{d}\sigma}{\mathrm{d}\Omega}\right)_{\mathrm{pt}}\left[1 - \frac{4}{3}\frac{\alpha_s}{\pi}\left(3\ln\delta + 4\ln\delta\,\ln 2\epsilon + \frac{\pi^2}{3} - \frac{5}{2}\right)\right], \tag{14.76}$$

where $(\mathrm{d}\sigma/\mathrm{d}\Omega)_{\mathrm{pt}}$ is the contribution of the lowest-order graph, figure 14.22, which
is given by equation (9.102) summed over quark colours and charges; terms of
order $\delta$ and $\epsilon$, and higher powers, are neglected. It is evident from (14.76) that
the jet parameters $\epsilon$ and $\delta$ serve to control the soft and collinear divergences,
which reappear as $\epsilon$ and $\delta$ tend to zero; they are ‘resolution parameters’.
The remarkable cancellation of the soft and collinear divergences between
the real and virtual emission processes is actually a general result in QED
(recall that in chapter 11 we declined to pursue the problem of such infrared
divergences). The Bloch-Nordsieck (1937) theorem states that ‘soft’ singularities
cancel between real and virtual processes when one adds up all final
states which are indistinguishable by virtue of the energy resolution of the
apparatus. The Kinoshita (1962), Lee and Nauenberg (1964) theorem states,
roughly speaking, that mass singularities are absent if one adds up all indistinguishable
mass-degenerate states. This is the reason for the finiteness of
the Sterman-Weinberg 2-jet cross section, in an analogous QCD case.

Returning to (14.76), it is important to note that the angular distribution
of this well-defined two-jet process is given precisely by the lowest-order
expression (9.102), just as was hoped in the simple parton model of section
9.5. Of course, the cross section depends on the jet parameters $\delta$ and $\epsilon$. The
formula (14.76) can be used, for example, to estimate the angular radius of
the jets, as a function of $E$.
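
A minimal sketch of such an estimate is given below. It evaluates the square bracket of (14.76) and then solves, by bisection in $\ln\delta$, for the cone half-angle that contains a chosen fraction of events. The function names, the target fraction of 0.7 and the input values of $\alpha_s$ are illustrative assumptions; the energy dependence enters only through whatever value of $\alpha_s$ one supplies, and the formula is trusted only for small $\delta$ and $\epsilon$.

```python
import math

def sterman_weinberg_fraction(alpha_s, eps, delta):
    """Square bracket of (14.76): two-jet fraction for resolution (eps, delta)."""
    bracket = (3.0 * math.log(delta)
               + 4.0 * math.log(delta) * math.log(2.0 * eps)
               + math.pi ** 2 / 3.0 - 2.5)
    return 1.0 - 4.0 * alpha_s / (3.0 * math.pi) * bracket

def cone_half_angle(alpha_s, eps, fraction, lo=1e-6, hi=1.0):
    """Half-angle delta giving the requested two-jet fraction.

    Assumes eps is small enough (below about 0.23) that the fraction
    increases monotonically with delta.
    """
    for _ in range(200):
        mid = math.sqrt(lo * hi)  # bisect in log(delta)
        if sterman_weinberg_fraction(alpha_s, eps, mid) < fraction:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)

for alpha_s in (0.15, 0.12):
    print(alpha_s, cone_half_angle(alpha_s, 0.1, 0.7))
```

A smaller $\alpha_s$, as expected at higher energy, gives a smaller half-angle for the same fraction: the jets become narrower.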

Although the Sterman-Weinberg jet definition was historically the first,
it is not the only possible one. Another, in some ways simpler, definition
(Kramer and Lampe 1987) is directly phrased in terms of the offending factors
$s_{13}^{-1}$ and $s_{23}^{-1}$ in (14.67). Let us introduce the dimensionless jet
mass variables

$$y_{ij} = s_{ij}/Q^2 = 2E_iE_j(1 - \cos\theta_{ij})/Q^2 \tag{14.77}$$

for any two partons $i$ and $j$; $s_{12}$ will be included, though no singularity is
involved. Here $E_i$ and $E_j$ are the (massless) parton energies, and $\theta_{ij}$ is the
angle between their 3-momenta, in the overall CMS. Then $i$ and $j$ are defined
to be in one jet if $y_{ij}$ is less than some given number $y$. Note that for small
$\theta_{ij}$, $y_{ij} \approx E_iE_j\theta_{ij}^2/Q^2$, so the single parameter $y$ provides effectively both an
energy and an angle cut. Clearly this definition is equivalent to a formulation
in terms of strips $1 > x_k > 1 - y$ on figure 14.17, as discussed earlier. Including
contributions, as before, from figures 14.22, 14.21, and 14.16, the resulting
2-jet cross section is found to be (Kramer and Lampe 1987)

$$\sigma_{2\text{-jet}} = \sigma_{\mathrm{pt}}\left[1 - \frac{2\alpha_s}{3\pi}\left(2\ln^2 y + 3\ln y - 4y\ln y + 1 - \pi^2/3\right)\right]. \tag{14.78}$$


Terms of order $y$ were calculated numerically. These include the contribution
from the (non-singular) region $y_{12} < y$, where the two quarks are in one jet
and the other jet is a pure gluon jet. Plainly the IRC singularities have been
eliminated from (14.78), at the cost of the jet mass resolution parameter $y$.
Kramer and Lampe also calculated the order $\alpha_s^2$ corrections to (14.78).

These two ways of regulating the IRC divergences in the 2-jet partonic
cross section have each been extensively developed into jet algorithms, as we
shall briefly discuss in section 14.6.2.


14.6 Further developments

14.6.1 Test of non-Abelian nature of QCD in $e^+e^- \to$ 4 jets


We have seen in section 14.3.1 how the colour factors associated with different
QCD vertices (problem 14.5) play an important part in determining the relative
weights of different parton-level processes. The quark-gluon colour factor
$C_F$ enters into the parton-level three-jet amplitude (14.67), but the triple-gluon
vertex is not involved at order $\alpha_s$. This vertex is an essential feature of
non-Abelian gauge theories, being absent in Abelian theories such as QED. A
direct measurement of the triple-gluon vertex colour factor, $C_A$, can be made
in the process $e^+e^- \to$ 4 jets.

4-jet events originate from the parton-level process $e^+e^- \to q\bar{q}g$ via three
mechanisms: the emission of a second bremsstrahlung gluon, splitting of the
first gluon into two gluons, and splitting of the first gluon into $n_f$ quark pairs.
As problem 14.5 shows, these three types of splitting vertices are characterized
in cross sections by the colour factors $C_F$, $C_A$ and $n_fT_R$, so that the cross
section can be written as (Ali and Kramer 2011)

$$\sigma_{4\text{-jet}} = \left(\frac{\alpha_s}{\pi}\right)^2 C_F\left[C_F\sigma_{bb} + C_A\sigma_{gg} + n_fT_R\sigma_{qq}\right]. \tag{14.79}$$

Measurements yield (Abbiendi et al. 2001)

$$\begin{aligned}
C_A/C_F &= 2.29 \pm 0.06[\text{stat.}] \pm 0.14[\text{syst.}] \\
T_R/C_F &= 0.38 \pm 0.03[\text{stat.}] \pm 0.06[\text{syst.}],
\end{aligned} \tag{14.80}$$

in good agreement with the theoretical predictions $C_A/C_F = 9/4$ and $T_R/C_F =
3/8$ in QCD.
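
The quoted predictions can be checked in a few lines. The sketch below uses the standard SU($N$) expressions $C_F = (N^2 - 1)/2N$, $C_A = N$ and $T_R = 1/2$, which reduce for $N = 3$ to the values found in problem 14.5; the general-$N$ formulas and the function name are brought in here for illustration. Combining the statistical and systematic errors of (14.80) in quadrature is likewise an assumption of this example, not a statement about the experimental analysis.

```python
import math
from fractions import Fraction

def colour_factors(n_colours=3):
    """Return (C_F, C_A, T_R) for SU(n_colours) as exact fractions."""
    n = Fraction(n_colours)
    c_f = (n * n - 1) / (2 * n)
    c_a = n
    t_r = Fraction(1, 2)
    return c_f, c_a, t_r

c_f, c_a, t_r = colour_factors(3)
print(c_f, c_a / c_f, t_r / c_f)

# Compare with the measured ratios of (14.80), errors combined in quadrature.
pull_ca = abs(2.29 - float(c_a / c_f)) / math.hypot(0.06, 0.14)
pull_tr = abs(0.38 - float(t_r / c_f)) / math.hypot(0.03, 0.06)
print(pull_ca, pull_tr)
```

Both measured ratios lie well within one combined standard deviation of the SU(3) values.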


14.6.2 Jet algorithms 


From the examples already discussed in this chapter, it is clear that jets are an 
essential element in making comparisons between experimental measurements 
involving final state particles in detectors, and theoretical calculations at the 
parton level using perturbative QCD. Conceptually, jets provide a common 
representation for these two classes of event — those at the detector level, 
and those at the parton level. For precision comparisons, it is necessary to 
have a rigorous definition of a jet — a jet algorithm — which should be equally 
applicable at the detector, and at the parton, level. In the more than thirty 
years that have passed since Sterman and Weinberg’s 1977 paper, many jet 
definitions have been developed and applied. All involve the basic notion of 
clustering together objects that are in some sense ‘near’ to each other. Two 
main classes of jet algorithm may be distinguished: cone algorithms based on 
proximity in coordinate space, as in the Sterman-Weinberg approach, and used 
extensively, until recently, at hadron colliders; and sequential recombination 
algorithms based on proximity in momentum space, as in the jet-mass criterion 
of Kramer and Lampe (1987), and widely used at $e^+e^-$ and $ep$ colliders.
Recent general reviews of jet algorithms are provided by Salam (2010) and by 
Ali and Kramer (2011); see also Ellis et al. (2008), Campbell et al. (2007), 
and Kluth (2006). Here we shall give only a brief introduction to sequential 
recombination algorithms — all of which are IRC safe — since it seems likely 
that they will dominate future jet analyses. 
The JADE algorithm (Bartel et al. 1986, Bethke et al. 1988) is a prominent
early example of sequential recombination algorithms applied in $e^+e^-$
annihilation reactions. Particles are clustered in a jet iteratively as long as
the quantity $y_{ij}$ of (14.77) is less than some prescribed value $y_c$. If for some
pair $(i, j)$, $y_{ij} < y_c$, particles $i$ and $j$ are combined into a compound object
(with the resultant 4-momentum, typically), and the process continues by
pairing the compound with a new particle $k$. The procedure stops when all
$y_{ij}$ distances are greater than $y_c$, and the compounds that remain at this stage
are the jets, by definition.
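
The procedure just described can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: it assumes that at each step the pair with the smallest $y_{ij}$ is the one merged (the usual practice, though not spelled out above), that momenta are given as tuples $(E, p_x, p_y, p_z)$ in the overall CMS, and that $Q^2$ is the square of the summed energy. All function names are hypothetical.

```python
import math

def cos_angle(p, q):
    """Cosine of the angle between the 3-momenta of p and q."""
    dot = p[1] * q[1] + p[2] * q[2] + p[3] * q[3]
    norm = math.sqrt((p[1] ** 2 + p[2] ** 2 + p[3] ** 2)
                     * (q[1] ** 2 + q[2] ** 2 + q[3] ** 2))
    return dot / norm

def jade_distance(p, q, q2):
    """y_ij of (14.77), using the energies of the (possibly compound) objects."""
    return 2.0 * p[0] * q[0] * (1.0 - cos_angle(p, q)) / q2

def cluster(particles, y_cut, distance=jade_distance):
    """Merge the closest pair until every y_ij is at least y_cut."""
    objs = [tuple(p) for p in particles]
    q2 = sum(p[0] for p in objs) ** 2  # valid in the overall CMS
    while len(objs) > 1:
        y_min, i, j = min((distance(objs[i], objs[j], q2), i, j)
                          for i in range(len(objs))
                          for j in range(i + 1, len(objs)))
        if y_min >= y_cut:
            break
        merged = tuple(a + b for a, b in zip(objs[i], objs[j]))  # add 4-momenta
        objs = [o for k, o in enumerate(objs) if k not in (i, j)] + [merged]
    return objs

# Symmetric three-parton event: unit energies, 120 degrees apart, y_ij = 1/3.
mercedes = [(1.0, math.cos(a), math.sin(a), 0.0)
            for a in (0.0, 2.0 * math.pi / 3.0, 4.0 * math.pi / 3.0)]
print(len(cluster(mercedes, 0.1)), len(cluster(mercedes, 0.4)))
```

With $y_c = 0.1$ the event is resolved as three jets; with $y_c = 0.4$ one pair is merged and the event is counted as a two-jet event.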


One drawback with this scheme is that in higher orders of perturbation
theory one meets terms of the form $\alpha_s^n\ln^{2n}y$ (generalizations of the $\alpha_s\ln^2 y$
term in (14.78)). Such terms can be large enough to invalidate a perturbative
approach. Also, it is possible for two soft particles moving in opposite directions
to get combined in the early stages of clustering, which runs counter
to the intuitive notion of a jet being restricted in angular radius. The $k_t$
algorithm (Catani et al. 1991) avoids these problems by replacing the $y_{ij}$ of
(14.77) by

$$y_{ij} = 2\min[E_i^2, E_j^2](1 - \cos\theta_{ij})/Q^2. \tag{14.81}$$

This amounts to defining ‘distance’ by the minimum transverse momentum
$k_t$ of the particles in nearby collinear pairs. The use of the minimum energy
ensures that the distance between two soft, back-to-back particles is larger
than that between a soft particle and a hard one that is close to it in angle.
The $k_t$ algorithm was widely used at LEP.
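
The difference between the two distance measures can be made concrete with a small numerical comparison. In the sketch below the energies (1 GeV soft particles, a 45 GeV hard particle), the opening angle of 0.5 rad and the total energy $Q = 100$ GeV are purely illustrative numbers chosen for this example.

```python
import math

def jade_y(e_i, e_j, cos_ij, q2):
    """Distance (14.77): product of the two energies."""
    return 2.0 * e_i * e_j * (1.0 - cos_ij) / q2

def kt_y(e_i, e_j, cos_ij, q2):
    """Distance (14.81): square of the smaller energy."""
    return 2.0 * min(e_i, e_j) ** 2 * (1.0 - cos_ij) / q2

q2 = 100.0 ** 2
soft_pair = (1.0, 1.0, -1.0)            # two soft particles, back to back
soft_hard = (1.0, 45.0, math.cos(0.5))  # soft particle 0.5 rad from a hard one

for name, measure in (("JADE", jade_y), ("kt", kt_y)):
    print(name, measure(*soft_pair, q2), measure(*soft_hard, q2))
```

With the JADE measure the two soft back-to-back particles are closer to each other than the soft particle is to its hard neighbour, so they would be merged first; with the $k_t$ measure the ordering is reversed, as stated above.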


The basic idea of the $k_t$ algorithm was extended to hadron colliders (Ellis
and Soper 1993, Catani et al. 1993), where the total energy of the hard
scattering particles is not well defined experimentally. The distance measure
$y_{ij}$ is replaced by

$$d_{ij} = \min[p_{ti}^{2p}, p_{tj}^{2p}]\,[(y_i - y_j)^2 + (\phi_i - \phi_j)^2]/R^2 \tag{14.82}$$

where, for particle $i$, $p_{ti}$ is the transverse momentum with respect to the
(beam) $z$-axis, $y_i$ is the rapidity along the beam axis (defined by
$y_i = \frac{1}{2}\ln[(E_i + p_{zi})/(E_i - p_{zi})]$), $\phi_i$ is the azimuthal angle in the plane transverse to the beam,
and $R$ is a jet parameter. The differences $y_i - y_j$ and $\phi_i - \phi_j$ have the property that they are
invariant under boosts along the beam direction. In addition, recombination
with the beam jets is controlled by the quantity $d_{iB} = p_{ti}^{2p}$, which is included
along with the $d_{ij}$'s when recombining all the particles into (i) jets with non-zero
transverse momentum, and (ii) beam jets. The power parameter $p$ takes
the value 1 in the (extended) $k_t$ algorithm, and $-1$ in the ‘anti-$k_t$’ algorithm
(Cacciari et al. 2008). Whereas the former (and $p = 0$) leads to irregularly
shaped jet boundaries, the latter leads to cone-like boundaries. The choice
$p = -1$ was made in early LHC analyses.
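
A compact sketch of sequential recombination with the distances of (14.82) is given below. It follows the commonly used inclusive variant, in which an object whose beam distance $d_{iB}$ is the smallest of all distances is removed and called a jet; that choice, the wrapping of the azimuthal difference into $[0, \pi]$, the function names and the four-particle test event are all assumptions made for this illustration, and no attempt is made at efficiency.

```python
import math

def from_pt_y_phi(pt, y, phi):
    """Hypothetical helper: massless (E, px, py, pz) from pt, rapidity, azimuth."""
    return (pt * math.cosh(y), pt * math.cos(phi), pt * math.sin(phi), pt * math.sinh(y))

def pt2(p):
    return p[1] ** 2 + p[2] ** 2

def rapidity(p):
    return 0.5 * math.log((p[0] + p[3]) / (p[0] - p[3]))

def azimuth(p):
    return math.atan2(p[2], p[1])

def d_ij(a, b, power, radius):
    """Pair distance of (14.82); the azimuthal difference is wrapped into [0, pi]."""
    dphi = abs(azimuth(a) - azimuth(b))
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    dr2 = (rapidity(a) - rapidity(b)) ** 2 + dphi ** 2
    return min(pt2(a) ** power, pt2(b) ** power) * dr2 / radius ** 2

def d_ib(a, power):
    """Beam distance p_t^(2p)."""
    return pt2(a) ** power

def inclusive_jets(particles, power, radius):
    """power = 1: k_t, power = 0: purely angular, power = -1: anti-k_t."""
    objs = [tuple(p) for p in particles]
    jets = []
    while objs:
        candidates = [(d_ib(o, power), i, -1) for i, o in enumerate(objs)]
        candidates += [(d_ij(objs[i], objs[j], power, radius), i, j)
                       for i in range(len(objs)) for j in range(i + 1, len(objs))]
        _, i, j = min(candidates)
        if j < 0:
            jets.append(objs.pop(i))  # beam distance smallest: i is a final jet
        else:
            merged = tuple(u + v for u, v in zip(objs[i], objs[j]))
            objs = [o for k, o in enumerate(objs) if k not in (i, j)] + [merged]
    return jets

# Two hard, back-to-back particles, each with one soft neighbour inside R = 0.4.
event = [from_pt_y_phi(100.0, 0.0, 0.0), from_pt_y_phi(1.0, 0.1, 0.2),
         from_pt_y_phi(100.0, 0.0, math.pi), from_pt_y_phi(1.0, 0.0, math.pi - 0.3)]
for power in (1, 0, -1):
    print(power, [round(math.sqrt(pt2(j)), 2) for j in inclusive_jets(event, power, 0.4)])
```

For this simple event all three choices of $p$ return the same two jets; the differences in jet shape mentioned above only show up in busier events.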
Problems

14.1

(a) Show that the antisymmetric 3q combination of equation (14.2)
is (i) a determinant, and (ii) invariant under the transformation
(14.14) for each colour wavefunction.

(b) Suppose that $p_\alpha$ and $q_\alpha$ stand for two SU(3)$_c$ colour wavefunctions,
transforming under an infinitesimal SU(3)$_c$ transformation via

$$p' = (1 + \mathrm{i}\boldsymbol{\eta}\cdot\boldsymbol{\lambda}/2)p,$$

and similarly for $q$. Consider the antisymmetric combination of
their components, given by

$$\begin{pmatrix} p_2q_3 - p_3q_2 \\ p_3q_1 - p_1q_3 \\ p_1q_2 - p_2q_1 \end{pmatrix} \equiv \begin{pmatrix} Q_1 \\ Q_2 \\ Q_3 \end{pmatrix};$$

that is, $Q_\alpha = \epsilon_{\alpha\beta\gamma}p_\beta q_\gamma$. Check that the three components $Q_\alpha$
transform as a $3^*_c$, in the particular case for which only the parameters
$\eta_1, \eta_2, \eta_3$ and $\eta_8$ are non-zero. [Note: you will need the
explicit forms of the $\lambda$ matrices (appendix M); you need to verify
the transformation law

$$Q' = (1 - \mathrm{i}\boldsymbol{\eta}\cdot\boldsymbol{\lambda}^*/2)Q.]$$
14.2

(a) Verify that the normally ordered QCD interaction $\bar{\hat{q}}_f\gamma^\mu\frac{1}{2}\lambda_a\hat{q}_f\hat{A}_{a\mu}$ is
C-invariant.

(b) Show that $\lambda_a\hat{F}_a^{\mu\nu}$ transforms under C according to (14.36).


14.3 Verify that the Lorentz-invariant ‘contraction’ $\epsilon_{\mu\nu\rho\sigma}F^{\mu\nu}F^{\rho\sigma}$ of two U(1)
(Maxwell) field strength tensors is equal to $8\boldsymbol{E}\cdot\boldsymbol{B}$.


14.4 Verify that the cross section for the exchange of a single massless scalar
gluon between two quarks (or between a quark and an antiquark) contains no
‘$1/\hat{t}^2$’ factor.


14.5 This problem is concerned with the evaluation of various ‘colour factors’.

(a) Consider first the colour factor needed for equation (14.73). The
‘colour wavefunction’ part of the amplitude (14.59) is

$$\sum_c a_c(c_3)\,\chi^\dagger(c_1)\,\frac{\lambda_c}{2}\,\chi(c_2) \tag{14.83}$$

where $c_1$, $c_2$ and $c_3$ label the colour degree of freedom of the quark,
antiquark and gluon respectively, and the sum on the index $c$ has
been indicated explicitly. The $\chi$'s are the colour wavefunctions of
the quark and antiquark, and are represented by three-component
column vectors; a convenient choice is

$$\chi(r) = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \quad \chi(b) = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, \quad \chi(g) = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} \tag{14.84}$$

by analogy with the spin wavefunctions of SU(2). The cross section
is obtained by forming the modulus squared of (14.83) and summing
over the colour labels $c_i$:

$$\sum_{c,d,c_1,c_2,c_3} a_c(c_3)\,\chi^*_r(c_1)\left(\frac{\lambda_c}{2}\right)_{rs}\chi_s(c_2)\,\chi^*_l(c_2)\left(\frac{\lambda_d}{2}\right)_{lm}\chi_m(c_1)\,a^*_d(c_3) \tag{14.85}$$

where summation is understood on the matrix indices on the $\chi$'s and
$\lambda$'s, which have been indicated explicitly. In this form the expression
is very similar to the spin summations considered in chapter 8
(cf equation (8.62)). We proceed to evaluate it as follows:

(i) Show that

$$\sum_{c_2}\chi_s(c_2)\chi^*_l(c_2) = \delta_{sl}.$$

(ii) Assuming the analogous result

$$\sum_{c_3}a_c(c_3)a^*_d(c_3) = \delta_{cd}$$

show that (14.85) becomes

$$\sum_c\left(\frac{\lambda_c}{2}\,\frac{\lambda_c}{2}\right)_{rr}$$

where the (implied) sum on $r$ runs from 1 to 3.

(iii) The expression $\sum_c\frac{\lambda_c}{2}\frac{\lambda_c}{2}$ is just the Casimir operator $\hat{C}_2$ (see
section M.5 in appendix M) for SU(3) in the fundamental representation
3, which from (M.67) has the value $C_F 1_3$, where $1_3$ is the
unit 3 × 3 matrix, and $C_F = 4/3$. Hence show that the colour factor
for (14.73) is 4.

Note that if we averaged over the colours of the initial quark, or
considered one particular colour, the colour factor would be $C_F$.
(b) The colour part for the triple gluon vertex $g_1 \to g_2 + g_3$ is

$$\sum_{c,d,e} a_d(c_2)\,a_e(c_3)\,f_{dec}\,a_c(c_1).$$

Show that the modulus squared of this, averaged over the initial
gluon colours and summed over the final gluon colours, is

$$\frac{1}{8}\sum_{c,d,e} f_{dec}f_{dec}$$

where each of $c, d, e$ runs from 1 to 8. Deduce using (12.84) that
this expression can be written as

$$\frac{1}{8}\mathrm{Tr}\left(\sum_d G_d^{(8)}G_d^{(8)}\right)$$

where $G_d^{(8)}$ ($d = 1 \ldots 8$) are the 8 × 8 matrices representing the generators
of SU(3) in the 8-dimensional (adjoint) representation (see
section 12.2). The expression $\left(\sum_d G_d^{(8)}G_d^{(8)}\right)$ is the SU(3) Casimir
operator $\hat{C}_2$ in the adjoint representation, which from (M.67) has
the value $C_A 1_8$, where $1_8$ is the 8 × 8 unit matrix, and $C_A = 3$.
Hence show that the (averaged, summed) triple gluon vertex colour
factor is $C_A = 3$.


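(c) The colour part of the $g \to q + \bar{q}$ vertex is

$$\chi^*_r(c_3)\left(\frac{\lambda_c}{2}\right)_{rs}\chi_s(c_2)\,a_c(c_1).$$

Show that the modulus squared of this, averaged over the initial
gluon colours and summed over the final quark colours is

$$\frac{1}{8}\sum_c\mathrm{Tr}\left(\frac{\lambda_c}{2}\,\frac{\lambda_c}{2}\right) = \frac{1}{2}.$$

This number is usually denoted by $T_R$.
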
14.6 Verify equation (14.60). 


14.7 Verify equation (14.72). 


14.8 Verify that expression (14.68) becomes the factor in large parentheses in
equation (14.73), when expressed in terms of the $x_i$'s.


QCD II: Asymptotic Freedom, the
Renormalization Group, and Scaling
Violations


In the previous chapter we learned that QCD amplitudes contributing to
$e^+e^- \to$ jets generally have IRC singularities, but that finite physical cross
sections can be obtained by including together kinematically indistinguishable
final states. The partial cross sections (for example $\sigma(e^+e^- \to 2\ \text{jets})$) will
depend on the IRC cut-off parameter(s). What about the fully inclusive process
$e^+e^- \to$ hadrons, where all final states are summed over? At order $\alpha^2\alpha_s$,
the parton-level diagrams contributing to this process are the same ones we
considered in section 14.5, namely figures 14.16, 14.21 and 14.22. If we denote
the amplitudes for these contributions by $F_{rg}$ (for real gluon emission), $F_{vg}$
(for virtual gluon emission) and $F_\gamma$ for the Born graph, then the partial cross
section $\sigma(e^+e^- \to 2\ \text{jets})$ includes $|F_\gamma|^2$, the interference term $2\mathrm{Re}(F_\gamma F^*_{vg})$,
and the integral of $|F_{rg}|^2$ over strips near the boundaries of figure 14.17. At
this order, the partial cross section $\sigma(e^+e^- \to 3\ \text{jets})$ is given by the integral
of $|F_{rg}|^2$ over the remaining (interior) region of figure 14.17. The corresponding
total cross section is thus simply the sum of $|F_\gamma|^2$, $2\mathrm{Re}(F_\gamma F^*_{vg})$, and the
integral of $|F_{rg}|^2$ over the whole of the $x_1$-$x_2$ phase space. Clearly the IRC
singularities will cancel, as in the 2-jet cross section, and the result will not
depend on any IRC cut-off parameter. Indeed, the result is (see for example
Muta 2010, section 5.1.2)

$$\sigma(e^+e^- \to \text{hadrons}) = \sigma_{\mathrm{pt}}(Q^2)(1 + \alpha_s/\pi). \tag{15.1}$$


This fully inclusive cross section is finite and free of IRC cut-offs. 


At first sight, this result might appear satisfactory. It predicts a cross
section somewhat greater than $\sigma_{\mathrm{pt}}$, as is observed in figure 14.1, from which
we might infer that $\alpha_s \sim 0.5$ or less. Assuming the expansion parameter is
$\alpha_s/\pi$, the implied perturbation series in powers of $\alpha_s$ would seem to be rapidly
convergent. However, this is an illusion, which is dispelled as soon as we go
to the next order in $\alpha_s$ (i.e. to the order $\alpha^2\alpha_s^2$ in the cross section).
FIGURE 15.1
Some higher-order processes contributing to $e^+e^- \to$ hadrons at the parton
level.


15.1 Higher-order QCD corrections to $\sigma(e^+e^- \to \text{hadrons})$:
large logarithms


Some typical graphs contributing to this order of the cross section are shown 
in figure 15.1 (note that, as with the O(a?a,) terms, some graphs will con- 
tribute via their modulus squared and some via interference terms). The 
result was obtained numerically by Dine and Saperstein (1979), and analyti- 
cally by Chetyrkin et al. (1979) and by Celmaster and Gonsalves (1980). For 
our present purposes, the crucial feature of the answer is the appearance of a 
term 


ami | oS 0( 42/12) (15.2) 


where $\mu$ is a mass scale (about which we shall shortly have a lot more to say,
but which for the moment may be thought of as related in some way to an
average quark mass), and the coefficient $\beta_0$ is given by

$$\beta_0 = \frac{33 - 2N_f}{12\pi} \tag{15.3}$$

where $N_f$ is the number of ‘active’ flavours (e.g. $N_f = 5$ above the $b\bar{b}$ threshold).
The term (15.2) raises the following problem. The ratio between it and
the $O(\alpha^2\alpha_s)$ term is clearly

$$-\beta_0\,\alpha_s \ln(Q^2/\mu^2). \tag{15.4}$$

If we take $N_f = 5$, $\alpha_s \approx 0.4$, $\mu \sim 1$ GeV and $Q^2 \sim (10\ \text{GeV})^2$, (15.4) is of order
1, and can in no sense be regarded as a small perturbation. Furthermore, the
correction (15.4), by itself, would predict large scaling violations in this cross
section, that is, large $Q^2$-dependent departures from the point-like Born cross
section, $\sigma_{\rm pt}(Q^2)$. But the data actually follow the point-like prediction very
well.
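The size of the ratio (15.4) for the inputs just quoted can be checked with a few lines of Python. This is only an illustrative sketch: the function names `beta0` and `log_ratio` are our own, and the inputs are the rough values given in the text rather than fitted quantities.

```python
import math


def beta0(n_f: int) -> float:
    """One-loop coefficient of equation (15.3): (33 - 2 N_f) / (12 pi)."""
    return (33 - 2 * n_f) / (12 * math.pi)


def log_ratio(alpha_s: float, q2: float, mu2: float, n_f: int) -> float:
    """Ratio (15.4) of the large-log term (15.2) to the O(alpha^2 alpha_s) term."""
    return -beta0(n_f) * alpha_s * math.log(q2 / mu2)


# Rough inputs from the text: N_f = 5, alpha_s ~ 0.4, mu ~ 1 GeV, Q ~ 10 GeV.
ratio = log_ratio(alpha_s=0.4, q2=10.0**2, mu2=1.0**2, n_f=5)
print(f"beta0 = {beta0(5):.3f}, ratio (15.4) = {ratio:.2f}")
```

With these inputs the magnitude of the ratio comes out slightly above 1, which is the sense in which (15.4) is ‘of order 1’.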
Suppose that, nevertheless, we consider the sum of (15.1) and (15.2), which
is

$$\sigma_{\rm pt}\left[1 + \frac{\alpha_s}{\pi}\left\{1 - \beta_0\,\alpha_s \ln(Q^2/\mu^2)\right\}\right]. \tag{15.5}$$

This suggests that one effect, at least, of these higher-order corrections is to
convert $\alpha_s$ to a $Q^2$-dependent quantity, namely $\alpha_s\{1 - \beta_0\alpha_s\ln(Q^2/\mu^2)\}$. We
have seen something very like this before, in equation (11.56), for the case
of QED. There is, however, one remarkable difference: here the coefficient of
the ln is negative, whereas that in (11.56) is positive. Apart from this (vital!)
difference, however, we can reasonably begin to think in terms of an effective
$Q^2$-dependent strong coupling constant $\alpha_s(Q^2)$.

Pressing on with the next order ($\alpha^2\alpha_s^3$) terms, we encounter a term (Samuel
and Surguladze 1991, Gorishnii et al. 1991)

$$\sigma_{\rm pt}\left[\alpha_s\,\beta_0 \ln(Q^2/\mu^2)\right]^2 \frac{\alpha_s}{\pi}, \tag{15.6}$$
and the ratio between this and (15.2) is precisely (15.4) once again! We are
now strongly inclined to suspect that we are seeing, in this class of terms, an
expansion of the form $(1+x)^{-1} = 1 - x + x^2 - x^3 \ldots$. If true, this would imply
that all terms of the form (15.2) and (15.6), and higher, sum up to give (cf
(11.63))

$$\sigma_{\rm pt}\left[1 + \frac{\alpha_s/\pi}{1 + \alpha_s\,\beta_0 \ln(Q^2/\mu^2)}\right]. \tag{15.7}$$

The ‘re-summation’ effected by (15.7) has a remarkable effect: the ‘dangerous’
large logarithms in (15.2) and (15.6) are now effectively in the denominator
(cf (11.56)), and their effect is such as to reduce the effective value of $\alpha_s$ as
$Q^2$ increases, which is exactly the property of asymptotic freedom.

We hasten to say that of course this is not how the property was discovered,
which was, rather, through the calculations of Politzer (1973) and Gross
and Wilczek (1973). Prior to their work, it was widely believed that any
quantum field theory would have a running coupling which behaved like that
of QED which, as we saw in section 11.5.3, increases for large $Q^2$ (short
distances). Such behaviour would make the scaling violations due to a term
like (15.7) even worse. It was therefore a mystery how quantum field theory
could account for the small scaling violations seen in the data. The discovery
that the running couplings of non-Abelian gauge theories became weaker at
large $Q^2$ opened the way for a quantitative understanding of parton-model
scaling, and perturbative QCD corrections to it.

To place the asymptotic freedom calculation in its proper context requires
a considerable detour. Referring to our previous discussion, we may ask: are
we guaranteed that still-higher-order terms will indeed continue to contain
pieces corresponding to the expansion of (15.7)? And what exactly is the
mass parameter $\mu$? Answering these questions will lead to the important
body of ideas going under the name of the ‘renormalization group’.

FIGURE 15.2
One-loop vacuum polarization contribution to $Z_3$.
15.2 The renormalization group and related ideas in QED 
15.2.1 Where do the large logs come from? 


We have taken the title of this section from that of Section 18.1 in Weinberg 
(1996), which we have found very illuminating, and to which we refer for a 
more detailed discussion. 


As we have just mentioned, the phenomenon of ‘large logarithms’ arises
also in the simpler case of QED. There, however, the factor corresponding
to $\alpha_s\beta_0 \sim \frac{1}{4}$ is $\alpha/3\pi \sim 10^{-3}$, so that it is only at quite unrealistically
enormous $|q^2|$ values that the corresponding factor $(\alpha/3\pi)\ln(|q^2|/m_e^2)$ (where $m_e$
is the electron mass) becomes of order unity. Nevertheless, the origin of the
logarithmic term is essentially the same in both cases, and the technicalities
are much simpler for QED (no photon self-interactions, no ghosts). We shall
therefore forget about QCD for a while, and concentrate on QED. Indeed, the
discussion of renormalization of QED given in chapter 11 will be sufficient to
answer the question in the title of this subsection.

For the answer does, in fact, fundamentally have to do with renormalization.
Let us go back to the renormalization of the charge in QED. We learned
in chapter 11 that the renormalized charge $e$ was given in terms of the ‘bare’
charge $e_0$ by the relation $e = e_0 (Z_2/Z_1) Z_3^{1/2}$ (see (11.6)), where in fact due
to the Ward identity $Z_1$ and $Z_2$ are equal (section 11.6), so that only $Z_3^{1/2}$ is
needed. To order $e^2$ in renormalized perturbation theory, including only the
$e^+e^-$ loop of figure 15.2, $Z_3$ is given by (cf (11.31))

$$Z_3^{[2]} = 1 + \Pi_\gamma^{[2]}(0) \tag{15.8}$$
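where, from (11.23) and (11.24),

$$\Pi_\gamma^{[2]}(q^2) = 8e^2\,\mathrm{i}\int_0^1 dx \int \frac{d^4k'}{(2\pi)^4}\,\frac{x(1-x)}{(k'^2 - \Delta_\gamma + \mathrm{i}\epsilon)^2} \tag{15.9}$$

and $\Delta_\gamma = m_e^2 - x(1-x)q^2$ with $q^2 < 0$. We regularize the $k'$ integral by a
cut-off $\Lambda$ as explained in sections 10.3.1 and 10.3.2, obtaining (problem 15.1)

$$\Pi_\gamma^{[2]}(q^2) = -\frac{e^2}{\pi^2}\int_0^1 dx\, x(1-x)\left\{\ln\left[\frac{\Lambda + (\Lambda^2 + \Delta_\gamma)^{1/2}}{\Delta_\gamma^{1/2}}\right] - \frac{\Lambda}{(\Lambda^2 + \Delta_\gamma)^{1/2}}\right\}. \tag{15.10}$$

Setting $q^2 = 0$ and retaining the dominant $\ln\Lambda$ term, we find that

$$\left(Z_3^{[2]}\right)^{1/2} = 1 - \left(\frac{\alpha}{3\pi}\right)\ln(\Lambda/m_e). \tag{15.11}$$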


It is not a coincidence that the coefficient $\alpha/3\pi$ of the ultraviolet divergence
is also the coefficient of the $\ln(|q^2|/m_e^2)$ term in (11.55)-(11.57); we need to
understand why.

We first recall how (11.55) was arrived at. It refers to the renormalized
self-energy part, which is defined by the ‘subtracted’ form

$$\bar\Pi_\gamma^{[2]}(q^2) = \Pi_\gamma^{[2]}(q^2) - \Pi_\gamma^{[2]}(0). \tag{15.12}$$

In the process of subtraction, the dependence on the cut-off $\Lambda$ disappears and
we are left with

$$\bar\Pi_\gamma^{[2]}(q^2) = -\frac{2\alpha}{\pi}\int_0^1 dx\, x(1-x)\ln\left[\frac{m_e^2}{m_e^2 - q^2 x(1-x)}\right] \tag{15.13}$$

as in equation (11.34). For large values of $|q^2|$ this leads to the ‘large log’
term $(\alpha/3\pi)\ln(|q^2|/m_e^2)$. Now, in order to form such a term, it is obviously
not possible to have just ‘$\ln|q^2|$’ appearing: the argument of the logarithm
must be dimensionless, so that some mass scale must be present, to which
$|q^2|$ can be compared. In the present case, that mass scale is evidently $m_e$,
which entered via the quantity $\Pi_\gamma^{[2]}(0)$, or equivalently via the renormalization
constant $Z_3^{[2]}$ (cf (15.11)). This is the beginning of the answer to our questions.
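The emergence of the large logarithm from (15.13) can be checked numerically. The sketch below is an illustration only: the function name `pi_bar`, the use of Simpson's rule, and the chosen value of $|q^2|/m_e^2$ are our own assumptions. It also uses the elementary integral $\int_0^1 x(1-x)\ln[x(1-x)]\,dx = -5/18$, which supplies the constant that accompanies the logarithm at large $|q^2|$.

```python
import math


def pi_bar(r: float, alpha: float, n: int = 20000) -> float:
    """Equation (15.13) for spacelike q^2, with r = |q^2| / m_e^2.

    Evaluated with the composite Simpson rule on 0 <= x <= 1 (n must be even).
    """
    def integrand(x: float) -> float:
        w = x * (1.0 - x)
        # For q^2 = -r m_e^2: m_e^2 / (m_e^2 - q^2 x(1-x)) = 1 / (1 + r x(1-x)).
        return w * math.log(1.0 / (1.0 + r * w))

    h = 1.0 / n
    total = integrand(0.0) + integrand(1.0)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return -(2.0 * alpha / math.pi) * total * h / 3.0


alpha = 1.0 / 137.0
r = 1.0e6  # illustrative large value of |q^2| / m_e^2
numerical = pi_bar(r, alpha)
leading_log = (alpha / (3.0 * math.pi)) * math.log(r)
# Keeping the constant from the x-integral: (alpha / 3 pi) * (ln r - 5/3).
with_constant = (alpha / (3.0 * math.pi)) * (math.log(r) - 5.0 / 3.0)
print(numerical, leading_log, with_constant)
```

The numerical value agrees closely with the form that keeps the constant, and approaches the pure leading-log term only slowly as $r$ grows, since the constant is suppressed merely by $1/\ln r$.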

Why is it $m_e$ that enters into $\Pi_\gamma^{[2]}(0)$ or $Z_3$? Part of the answer, once
again, is of course that a ‘$\ln\Lambda$’ cannot appear in that form, but must be
‘$\ln(\Lambda/\text{some mass})$’. So we must enquire: what determines the ‘some mass’?
With this question we have reached the heart of the problem (for the moment).
The answer is, in fact, not immediately obvious: it lies in the prescription used
to define the renormalized coupling constant; this prescription, whatever it is,
determines $Z_3$.

The value (15.8) (or (11.31)) was determined from the requirement that
the $O(e^2)$ corrected photon propagator (in $\xi = 1$ gauge) had the simple form
$-\mathrm{i}g_{\mu\nu}/q^2$ as $q^2 \to 0$; that is, as the photon goes on-shell. Now, this is a
perfectly ‘natural’ definition of the renormalized charge, but it is by no means
forced upon us. In fact the appearance of a singularity in $Z_3^{[2]}$ as $m_e \to 0$
suggests that it is inappropriate to the case in which fermion masses are
neglected. We could in principle choose a different value of $q^2$, say $q^2 = -\mu^2$,
at which to ‘subtract’. Certainly the difference between $\Pi_\gamma^{[2]}(q^2 = 0)$ and
$\Pi_\gamma^{[2]}(q^2 = -\mu^2)$ is finite as $\Lambda \to \infty$, so such a redefinition of ‘the’ renormalized
charge would only amount to a finite shift. Nevertheless, even a finite shift
is alarming, to those accustomed to a certain ‘sanctity’ in the value $\alpha = 1/137$!
We have to concede, however, that if the point of renormalization is to render
amplitudes finite by taking certain constants from experiment, then any choice
of such constants should be as good as any other: for example, the ‘charge’
defined at $q^2 = -\mu^2$ rather than at $q^2 = 0$.

Thus there is, actually, a considerable arbitrariness in the way renormal- 
ization can be done — a fact to which we did not draw attention in our earlier 
discussions in chapters 10 and 11. Nevertheless, it must somehow be the case 
that, despite this arbitrariness, physical results remain the same. We shall 
come back to this important point shortly. 


15.2.2 Changing the renormalization scale 
The recognition that the renormalization scale ($-\mu^2$ in this case) is arbitrary
suggests a way in which we might exploit the situation, so as to avoid large
‘$\ln(|q^2|/m_e^2)$’ terms: we renormalize at a large value of $\mu^2$! Consider what
happens if we define a new $Z_3^{[2]}$ by

$$Z_3^{[2]}(\mu) = 1 + \Pi_\gamma^{[2]}(q^2 = -\mu^2). \tag{15.14}$$

Then for $\mu^2 \gg m_e^2$, but $\mu^2 \ll \Lambda^2$, we have

$$\left(Z_3^{[2]}(\mu)\right)^{1/2} = 1 - \left(\frac{\alpha}{3\pi}\right)\ln(\Lambda/\mu), \tag{15.15}$$

and a new renormalized self-energy

$$\bar\Pi_\gamma^{[2]}(q^2, \mu^2) = \Pi_\gamma^{[2]}(q^2) - \Pi_\gamma^{[2]}(q^2 = -\mu^2) = -\frac{e^2}{2\pi^2}\int_0^1 dx\, x(1-x)\ln\left[\frac{m_e^2 + \mu^2 x(1-x)}{m_e^2 - q^2 x(1-x)}\right]. \tag{15.16}$$

For $\mu^2$ and $-q^2$ both $\gg m_e^2$, the logarithm is now $\ln(|q^2|/\mu^2)$ which is small
when $|q^2|$ is of order $\mu^2$. It seems, therefore, that with this different
renormalization prescription we have ‘tamed’ the large logarithms.

However, we have forgotten that, for consistency, the ‘$e$’ we should now be
using is the one defined, in terms of $e_0$, via

$$e_\mu = \left(Z_3^{[2]}(\mu)\right)^{1/2} e_0 = \left(1 - \frac{\alpha}{3\pi}\ln(\Lambda/\mu)\right) e_0 \tag{15.17}$$

rather than

$$e = \left(Z_3^{[2]}\right)^{1/2} e_0 = \left(1 - \frac{\alpha}{3\pi}\ln(\Lambda/m_e)\right) e_0, \tag{15.18}$$

working always to one-loop order with an $e^+e^-$ loop. The relation between
$e_\mu$ and $e$ is then

$$e_\mu = \frac{1 - \frac{\alpha}{3\pi}\ln(\Lambda/\mu)}{1 - \frac{\alpha}{3\pi}\ln(\Lambda/m_e)}\; e \approx \left(1 + \frac{\alpha}{3\pi}\ln(\mu/m_e)\right) e \tag{15.19}$$

to leading order in $\alpha$. Equation (15.19) indeed represents, as anticipated,
a finite shift from ‘$e$’ to ‘$e_\mu$’, but the problem with it is that a ‘large log’
has resurfaced in the form of $\ln(\mu/m_e)$ (remember that our idea was to take
$\mu^2 \gg m_e^2$). Although the numerical coefficient of the log in (15.19) is certainly
small, a similar procedure applied to QCD will involve the larger coefficient
$\beta_0\alpha_s$ as in (15.5), and the correction analogous to (15.19) will be of order 1,
invalidating the approach.
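For readers who want the intermediate step, the following is a short sketch of how (15.19) follows from the ratio of (15.17) to (15.18), expanding the denominator to first order in $\alpha$:

```latex
\begin{align*}
\frac{e_\mu}{e}
  &= \frac{1 - \frac{\alpha}{3\pi}\ln(\Lambda/\mu)}{1 - \frac{\alpha}{3\pi}\ln(\Lambda/m_e)} \\
  &\approx 1 + \frac{\alpha}{3\pi}\left[\ln(\Lambda/m_e) - \ln(\Lambda/\mu)\right] \\
  &= 1 + \frac{\alpha}{3\pi}\ln(\mu/m_e).
\end{align*}
```

The cut-off $\Lambda$ drops out, as it must for a finite shift, but the logarithm of the ratio of the two renormalization points remains.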

We have to be more subtle. Instead of making one jump from $m_e^2$ to a
large value $\mu^2$, we need to proceed in stages. We can calculate $e_\mu$ from $e$ as
long as $\mu$ is not too different from $m_e$. Then we can proceed to $e_{\mu'}$ for $\mu'$
not too different from $\mu$, and so on. Rather than thinking of such a process
in discrete stages $m_e \to \mu \to \mu' \to \ldots$, it is more convenient to consider
infinitesimal steps: that is, we regard $e_{\mu'}$ at the scale $\mu'$ as being a continuous
function of $e_\mu$ at scale $\mu$, and of whatever other dimensionless variables exist
in the problem (since the $e$'s are themselves dimensionless). In the present
case, these other variables are $\mu'/\mu$ and $m_e/\mu$, so that $e_{\mu'}$ must have the
form

$$e_{\mu'} = E(e_\mu, \mu'/\mu, m_e/\mu). \tag{15.20}$$

Differentiating (15.20) with respect to $\mu'$ and letting $\mu' = \mu$ we obtain

$$\mu\frac{de_\mu}{d\mu} = \beta_{\rm em}(e_\mu, m_e/\mu) \tag{15.21}$$

where

$$\beta_{\rm em}(e_\mu, m_e/\mu) = \left[\frac{\partial}{\partial z}E(e_\mu, z, m_e/\mu)\right]_{z=1}. \tag{15.22}$$

For $\mu \gg m_e$ equation (15.21) reduces to

$$\mu\frac{de_\mu}{d\mu} = \beta_{\rm em}(e_\mu, 0) \equiv \beta_{\rm em}(e_\mu), \tag{15.23}$$

which is a form of Callan-Symanzik equation (Callan 1970, Symanzik 1970);
it governs the change of the coupling constant $e_\mu$ as the renormalization scale
$\mu$ changes.
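To this one-loop order, it is easy to calculate the crucial quantity $\beta_{\rm em}(e_\mu)$.
Returning to (15.17), we may write the bare coupling $e_0$ as

$$e_0 = e_\mu\left(1 - \frac{\alpha}{3\pi}\ln(\Lambda/\mu)\right)^{-1} \approx e_\mu\left(1 + \frac{\alpha}{3\pi}\ln(\Lambda/\mu)\right) \approx e_\mu\left(1 + \frac{\alpha_\mu}{3\pi}\ln(\Lambda/\mu)\right), \tag{15.24}$$

where the last step follows from the fact that $e$ and $e_\mu$ differ by $O(e^3)$, which
would be a higher-order correction to (15.24). Now the unrenormalized coupling
is certainly independent of $\mu$. Hence, differentiating (15.24) with respect
to $\mu$ at fixed $e_0$, we find

$$\left.\frac{de_\mu}{d\mu}\right|_{e_0} - \frac{\alpha_\mu}{3\pi\mu}\,e_\mu + \frac{\alpha_\mu}{\pi}\ln(\Lambda/\mu)\left.\frac{de_\mu}{d\mu}\right|_{e_0} = 0. \tag{15.25}$$

Working to order $e_\mu^3$, we can drop the last term in (15.25), obtaining finally
(to one-loop order)

$$\left.\mu\frac{de_\mu}{d\mu}\right|_{e_0} = \frac{e_\mu^3}{12\pi^2} \quad \left(\equiv \beta_{\rm em}^{[2]}(e_\mu)\right). \tag{15.26}$$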


We can now integrate equation (15.26) to obtain $e_\mu$ at an arbitrary scale $\mu$,
in terms of its value at some scale $\mu = M$, chosen in practice large enough so
that for variable scales $\mu$ greater than $M$ we can neglect $m_e$ compared with $\mu$,
but small enough so that $\ln(M/m_e)$ terms do not invalidate the perturbation
theory calculation of $e_M$ from $e$. The solution of (15.26) is then (problem
15.2)

$$\ln(\mu/M) = 6\pi^2\left(\frac{1}{e_M^2} - \frac{1}{e_\mu^2}\right) \tag{15.27}$$

or equivalently

$$e_\mu^2 = \frac{e_M^2}{1 - \frac{e_M^2}{12\pi^2}\ln(\mu^2/M^2)} \tag{15.28}$$

which is

$$\alpha_\mu = \frac{\alpha_M}{1 - \frac{\alpha_M}{3\pi}\ln(\mu^2/M^2)} \tag{15.29}$$

where $\alpha = e^2/4\pi$. The crucial point is that the ‘large log’ is now in the
denominator (and has coefficient $\alpha_M/3\pi$!). We note that the general solution
of (15.23) may be written as

$$\ln(\mu/M) = \int_{e_M}^{e_\mu}\frac{de}{\beta_{\rm em}(e)}. \tag{15.30}$$

We have made progress in understanding how the coupling changes as
the renormalization scale changes, and how a ‘large logarithmic’ change as in
(15.19) can be brought under control via (15.29). The final piece in the puzzle
is to understand how this can help us with the large $-q^2$ behaviour of our
cross section, the problem we originally started from.
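As a cross-check on (15.29), the one-loop flow equation can be integrated step by step and compared with the closed form. The sketch below is illustrative only: it rewrites (15.26) in terms of $\alpha_\mu = e_\mu^2/4\pi$, which gives $d\alpha_\mu/d\ln\mu^2 = \alpha_\mu^2/3\pi$, and the function names, the Runge-Kutta integrator and the numerical inputs are our own assumptions.

```python
import math


def beta_em(alpha: float) -> float:
    """One-loop QED beta function for alpha: d(alpha)/d(ln mu^2) = alpha^2 / (3 pi)."""
    return alpha**2 / (3.0 * math.pi)


def run_alpha_numerically(alpha_m: float, log_ratio: float, steps: int = 1000) -> float:
    """Integrate the flow from ln(mu^2/M^2) = 0 up to log_ratio with
    a fourth-order Runge-Kutta scheme."""
    h = log_ratio / steps
    alpha = alpha_m
    for _ in range(steps):
        k1 = beta_em(alpha)
        k2 = beta_em(alpha + 0.5 * h * k1)
        k3 = beta_em(alpha + 0.5 * h * k2)
        k4 = beta_em(alpha + h * k3)
        alpha += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return alpha


def run_alpha_closed_form(alpha_m: float, log_ratio: float) -> float:
    """Equation (15.29): alpha_M / (1 - (alpha_M / 3 pi) ln(mu^2 / M^2))."""
    return alpha_m / (1.0 - alpha_m / (3.0 * math.pi) * log_ratio)


alpha_m = 1.0 / 137.0        # illustrative starting value at mu = M
log_ratio = math.log(1.0e6)  # ln(mu^2 / M^2) for mu = 1000 M
numeric = run_alpha_numerically(alpha_m, log_ratio)
closed = run_alpha_closed_form(alpha_m, log_ratio)
print(numeric, closed)
```

Taking many small steps in $\ln\mu^2$ is the numerical counterpart of the ‘proceed in stages’ idea: no single step involves a large logarithm, yet the accumulated change reproduces the resummed formula.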


15.2.3 The RGE and large —q? behaviour in QED 


To see the connection we need to implement the fundamental requirement,
stated at the end of section 15.2.1, that predictions for physically measurable
quantities must not depend on the renormalization scale $\mu$. Consider, for
example, our annihilation cross section $\sigma$ for $e^+e^- \to$ hadrons, pretending
that the one-loop corrections we are interested in are those due to QED rather
than QCD. We need to work in the spacelike region, so as to be consistent
with all the foregoing discussion. To make this clear, we shall now denote
the 4-momentum of the virtual photon by $q$ rather than $Q$, and take $q^2 < 0$
as in sections 15.2.1 and 15.2.2. Bearing in mind the way we used the
‘dimensionless-ness’ of the $e$'s in (15.20), let us focus on the dimensionless
ratio $\sigma/\sigma_{\rm pt} \equiv S$. Neglecting all masses, $S$ can only be a function of the
dimensionless ratio $|q^2|/\mu^2$ and of $e_\mu$:

$$S = S(|q^2|/\mu^2, e_\mu). \tag{15.31}$$

But $S$ must ultimately have no $\mu$ dependence. It follows that the $\mu^2$ dependence
arising via the $|q^2|/\mu^2$ argument must cancel that associated with $e_\mu$.
This is why the $\mu^2$-dependence of $e_\mu$ controls the $|q^2|$ dependence of $S$, and
hence of $\sigma$. In symbols, this condition is represented by the equation

$$\left(\mu\left.\frac{\partial}{\partial\mu}\right|_{e_\mu} + \mu\left.\frac{de_\mu}{d\mu}\right|_{e_0}\frac{\partial}{\partial e_\mu}\right) S(|q^2|/\mu^2, e_\mu) = 0, \tag{15.32}$$

or

$$\left(\mu\left.\frac{\partial}{\partial\mu}\right|_{e_\mu} + \beta_{\rm em}(e_\mu)\frac{\partial}{\partial e_\mu}\right) S(|q^2|/\mu^2, e_\mu) = 0. \tag{15.33}$$

Equation (15.33) is referred to as ‘the renormalization group equation
(RGE) for $S$’. The terminology goes back to Stueckelberg and Peterman
(1953), who were the first to discuss the freedom associated with the choice
of renormalization scale. The ‘group’ connotation is a trifle obscure, but all
it really amounts to is the idea that if we do one infinitesimal shift in $\mu^2$, and
then another, the result will be a third such shift; in other words, it is a kind of
‘translation group’. It was, however, Gell-Mann and Low (1954) who realized
how equation (15.33) could be used to calculate the large $|q^2|$ behaviour of $S$,
as we now explain.

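It is convenient to work in terms of $\mu^2$ and $\alpha$ rather than $\mu$ and $e$. Equation
(15.33) is then

$$\left(\mu^2\left.\frac{\partial}{\partial\mu^2}\right|_{\alpha_\mu} + \beta_{\rm em}(\alpha_\mu)\frac{\partial}{\partial\alpha_\mu}\right) S(|q^2|/\mu^2, \alpha_\mu) = 0, \tag{15.34}$$

where $\beta_{\rm em}(\alpha_\mu)$ is defined by

$$\beta_{\rm em}(\alpha_\mu) \equiv \mu^2\left.\frac{\partial\alpha_\mu}{\partial\mu^2}\right|_{e_0}. \tag{15.35}$$

From (15.35) and (15.26) we deduce that, to the one-loop order to which we
are working,

$$\beta_{\rm em}^{[2]}(\alpha_\mu) = \frac{\alpha_\mu^2}{3\pi}. \tag{15.36}$$

Now introduce the important variable

$$t = \ln(|q^2|/\mu^2). \tag{15.37}$$

Equation (15.34) then becomes

$$\left[-\frac{\partial}{\partial t} + \beta_{\rm em}(\alpha_\mu)\frac{\partial}{\partial\alpha_\mu}\right] S(e^t, \alpha_\mu) = 0. \tag{15.38}$$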

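This is a first-order differential equation which can be solved by implicitly
defining a new function, the running coupling $\alpha(|q^2|)$, as follows (compare
(15.30)):

$$t = \int_{\alpha_\mu}^{\alpha(|q^2|)}\frac{d\alpha}{\beta_{\rm em}(\alpha)}. \tag{15.39}$$

To see how this helps, we have to recall how to differentiate an integral with
respect to one of its limits or, more generally, the formula

$$\frac{\partial}{\partial a}\int^{f(a)} g(x)\,dx = g(f(a))\,\frac{\partial f}{\partial a}. \tag{15.40}$$

First, let us differentiate (15.39) with respect to $t$ at fixed $\alpha_\mu$; we obtain

$$1 = \frac{1}{\beta_{\rm em}(\alpha(|q^2|))}\,\frac{\partial\alpha(|q^2|)}{\partial t}. \tag{15.41}$$

Next, differentiate (15.39) with respect to $\alpha_\mu$ at fixed $t$ (note that $\alpha(|q^2|)$ will
depend on $\mu$ and hence on $\alpha_\mu$); we obtain

$$0 = \frac{\partial\alpha(|q^2|)}{\partial\alpha_\mu}\,\frac{1}{\beta_{\rm em}(\alpha(|q^2|))} - \frac{1}{\beta_{\rm em}(\alpha_\mu)}, \tag{15.42}$$

the minus sign coming from the fact that $\alpha_\mu$ is the lower limit in (15.39).
From (15.41) and (15.42) we find

$$\left[-\frac{\partial}{\partial t} + \beta_{\rm em}(\alpha_\mu)\frac{\partial}{\partial\alpha_\mu}\right]\alpha(|q^2|) = 0. \tag{15.43}$$

It follows that $S(1, \alpha(|q^2|))$ is a solution of (15.38).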
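The last step can be made explicit with the chain rule. The following is a minimal sketch, assuming only that $S(1, \alpha)$ is a differentiable function of its second argument:

```latex
\begin{align*}
\left[-\frac{\partial}{\partial t}
      + \beta_{\rm em}(\alpha_\mu)\frac{\partial}{\partial \alpha_\mu}\right]
  S\bigl(1, \alpha(|q^2|)\bigr)
&= \left.\frac{\partial S(1,\alpha)}{\partial \alpha}\right|_{\alpha = \alpha(|q^2|)}
   \left[-\frac{\partial}{\partial t}
         + \beta_{\rm em}(\alpha_\mu)\frac{\partial}{\partial \alpha_\mu}\right]
   \alpha(|q^2|) \\
&= 0 \qquad \text{by (15.43)}.
\end{align*}
```

At $t = 0$ the two limits of the integral in (15.39) coincide, so $\alpha(|q^2|) = \alpha_\mu$ there and $S(1, \alpha(|q^2|))$ reduces to $S(1, \alpha_\mu)$, which is the correct boundary value of $S(e^t, \alpha_\mu)$.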
This is a remarkable result. It shows that all the dependence of $S$ on
the (momentum)$^2$ variable $|q^2|$ enters through that of the running coupling
$\alpha(|q^2|)$. Of course, this result is only valid in a regime of $-q^2$ which is much
greater than all quantities with dimension (mass)$^2$, for example the squares of
all particle masses, which do not appear in (15.31). This is why the technique
applies only at ‘high’ $-q^2$. The result implies that if we can calculate $S(1, \alpha_\mu)$
(i.e. $S$ at the point $q^2 = -\mu^2$) at some definite order in perturbation theory,
then replacing $\alpha_\mu$ by $\alpha(|q^2|)$ will allow us to predict the $q^2$-dependence (at
large $-q^2$). All we need to do is solve (15.39). Indeed, for QED with one
$e^+e^-$ loop we have seen that $\beta_{\rm em}^{[2]}(\alpha) = \alpha^2/3\pi$. Hence integrating (15.39) we
obtain

$$\alpha(|q^2|) = \frac{\alpha_\mu}{1 - \frac{\alpha_\mu}{3\pi}\,t} = \frac{\alpha_\mu}{1 - \frac{\alpha_\mu}{3\pi}\ln(|q^2|/\mu^2)}. \tag{15.44}$$

This is almost exactly the formula we proposed in (11.57), on plausibility
grounds.¹

Suppose now that the leading QED perturbative contribution to $S(1, \alpha_\mu)$
is $S_1\alpha_\mu$. Then the terms contained in $S(1, \alpha(|q^2|))$ in this approximation can
be found by expanding in powers of $\alpha_\mu$:

$$S(1, \alpha(|q^2|)) = 1 + S_1\,\alpha(|q^2|) = 1 + S_1\alpha_\mu\left[1 - \frac{\alpha_\mu t}{3\pi}\right]^{-1} = 1 + S_1\alpha_\mu\left[1 + \frac{\alpha_\mu t}{3\pi} + \left(\frac{\alpha_\mu t}{3\pi}\right)^2 + \ldots\right] \tag{15.45}$$

where $t = \ln(|q^2|/\mu^2)$. The next-higher-order calculation of $S(1, \alpha_\mu)$ would be
$S_2\alpha_\mu^2$, say, which generates the terms

$$S_2\,\alpha^2(|q^2|) = S_2\alpha_\mu^2\left[1 + \frac{2\alpha_\mu t}{3\pi} + \ldots\right]. \tag{15.46}$$

Comparing (15.45) and (15.46) we see that each power of the large log factor
appearing in (15.46) comes with one more power of $\alpha_\mu$ than in (15.45). Provided
$\alpha_\mu$ is small, then, the leading terms in $t, t^2, \ldots$ are contained in (15.45).
It is in this sense that replacing $S(1, \alpha_\mu)$ by $S(1, \alpha(|q^2|))$ sums all ‘leading log
terms’.
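The statement that the running coupling sums the leading logarithms can be illustrated numerically by comparing partial sums of the series in (15.45) with the closed form (15.44). This is a sketch under our own assumptions: the function names are hypothetical, and the value of $t$ is chosen unrealistically large simply to make the expansion parameter $\alpha_\mu t/3\pi$ appreciable.

```python
import math


def running_alpha(alpha_mu: float, t: float) -> float:
    """Equation (15.44): one-loop QED running coupling, t = ln(|q^2| / mu^2)."""
    return alpha_mu / (1.0 - alpha_mu * t / (3.0 * math.pi))


def leading_log_partial_sum(alpha_mu: float, t: float, n_terms: int) -> float:
    """First n_terms of alpha_mu * sum_k (alpha_mu t / 3 pi)^k, as in (15.45)."""
    x = alpha_mu * t / (3.0 * math.pi)
    return alpha_mu * sum(x**k for k in range(n_terms))


alpha_mu = 1.0 / 137.0
t = 600.0  # deliberately huge, so that alpha_mu * t / (3 pi) is not small
for n_terms in (1, 2, 5, 20):
    print(n_terms, leading_log_partial_sum(alpha_mu, t, n_terms))
print("resummed:", running_alpha(alpha_mu, t))
```

Each additional term corresponds to one more order of perturbation theory at fixed $\alpha_\mu$; the closed form gives the limit of the sequence directly.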

In fact, of course, the one-loop (and higher) corrections to $S$ in which we
are really interested are those due to QCD, rather than QED, corrections. But
the logic is exactly the same. The leading ($O(\alpha_s)$) perturbative contribution
to $S = \sigma/\sigma_{\rm pt}$ at $q^2 = -\mu^2$ is given in (15.1) as $\alpha_s(\mu^2)/\pi$. It follows that
the ‘leading log corrections’ at high $-q^2$ are summed up by replacing this
expression by $\alpha_s(|q^2|)/\pi$, where the running $\alpha_s(|q^2|)$ is determined by solving
(15.39) with the QCD analogue of (15.36), to which we now turn.


¹ The difference has to do, of course, with the different renormalization prescriptions. Eq
(11.57) is written in terms of an ‘$\alpha$’ defined at $q^2 = 0$, and without neglect of $m_e$.
15.3 Back to QCD: asymptotic freedom
15.3.1 One loop calculation


The reader will of course have realized, some time back, that the quantity $\beta_0$
introduced in (15.3) must be precisely the coefficient of $\alpha_s^2$ in the one-loop
contribution to the $\beta$-function of QCD defined by

$$\beta_s = \mu^2\left.\frac{\partial\alpha_s}{\partial\mu^2}\right|_{\text{fixed bare } \alpha_s} \tag{15.47}$$

that is to say,

$$\beta_s(\text{one loop}) = -\beta_0\,\alpha_s^2 \tag{15.48}$$

with

$$\beta_0 = \frac{33 - 2N_f}{12\pi}. \tag{15.49}$$

For $N_f \le 16$ the quantity $\beta_0$ is positive, so that the sign of (15.48) is opposite
to that of the QED analogue, equation (15.36). Correspondingly, (15.44) is
replaced by

$$\alpha_s(|q^2|) = \frac{\alpha_s(\mu^2)}{1 + \alpha_s(\mu^2)\,\beta_0\ln(Q^2/\mu^2)} \tag{15.50}$$

where $Q^2 = |q^2|$.² Then replacing $\alpha_s$ in (15.1) by (15.50) leads to (15.7).
Thus in QCD the strong coupling runs in the opposite way to QED, becoming
smaller at large values of $Q^2$ (or small distances): the property of
asymptotic freedom. The justly famous result (15.49) was first obtained by
Politzer (1973), Gross and Wilczek (1973), and ’t Hooft. ’t Hooft’s result,
announced at a conference in Marseilles in 1972, was not published. The
published calculation of Politzer and of Gross and Wilczek quickly attracted
enormous interest, because it immediately offered a way to understand how
the successful parton model could be reconciled with the undoubtedly very
strong binding forces between quarks. The resolution, we now understand,
lies in quite subtle properties of renormalized quantum field theory, involving
first the exposure of ‘large logarithms’, then their re-summation in terms of
the running coupling, and of course the crucial sign of the $\beta$-function. Not
only did the result (15.49) explain the success of the parton model: it also,
we repeat, opened the prospect of performing reliable perturbative calculations
in a strongly interacting theory, at least at high $Q^2$. For example, at
sufficiently high $Q^2$, we can reliably compute the $\beta$ function in perturbation
theory. The result of Politzer and of Gross and Wilczek, when combined with
the motivations for a colour SU(3) group discussed in the previous chapter, led
rapidly to the general acceptance of QCD as the theory of strong interactions,
a conclusion reinforced by the demonstration by Coleman and Gross (1973)
that no theory without Yang-Mills fields possessed the property of asymptotic
freedom.


² Except that in (15.50) $\alpha_s$ is evaluated at large spacelike values of $q^2$, whereas in (15.7)
it is wanted at large timelike values. Readers troubled by this may consult Peskin and
Schroeder (1995) section 18.5. The difficulty is evaded in the approach of section 15.6
below.


FIGURE 15.3
$q\bar{q}$ vacuum polarization correction to the gluon propagator.

In section 11.5.3 we gave the conventional physical interpretation of the
way in which the running of the QED coupling tends to increase its value at
distances short enough to probe inside the screening provided by $e^+e^-$ pairs
($|q|^{-1} \ll m_e^{-1}$). This vacuum polarization screening effect is also present in
(15.49) via the term $-2N_f/12\pi$, the value of which can be quite easily understood.
It arises from the ‘$q\bar{q}$’ vacuum polarization diagram of figure 15.3, which is
precisely analogous to the $e^+e^-$ diagram used to calculate $\bar\Pi_\gamma^{[2]}(q^2)$ in QED.
The only new feature in figure 15.3 is the presence of the $\lambda/2$-matrices at each
vertex. If ‘$a$’ and ‘$b$’ are the colour labels of the ingoing and outgoing gluons,
the $\lambda/2$-matrix factors must be

$$\sum_{\alpha,\beta=1}^{3}\left(\frac{\lambda_a}{2}\right)_{\alpha\beta}\left(\frac{\lambda_b}{2}\right)_{\beta\alpha}, \tag{15.51}$$

since there are no free quark indices (of type $\alpha$, $\beta$) on the external legs of the
diagram. It is simple to check that (15.51) has the value $\frac{1}{2}\delta_{ab}$ (this is, in fact,
the way the $\lambda$'s are conventionally normalized). Hence for one quark flavour
we expect ‘$\alpha/3\pi$’ to be replaced by ‘$\alpha_s/6\pi$’, in agreement with the second
term in (15.49).
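The value of the colour factor (15.51) can be confirmed by brute force. The sketch below assumes the standard explicit form of the eight Gell-Mann matrices; the helper name `colour_factor` is our own, and the colour labels are indexed from 0 to 7 in the code rather than 1 to 8.

```python
import math

IMAG = 1j
S3 = 1.0 / math.sqrt(3.0)

# The eight Gell-Mann matrices lambda_1 ... lambda_8 in their standard form.
GELL_MANN = [
    [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
    [[0, -IMAG, 0], [IMAG, 0, 0], [0, 0, 0]],
    [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
    [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
    [[0, 0, -IMAG], [0, 0, 0], [IMAG, 0, 0]],
    [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
    [[0, 0, 0], [0, 0, -IMAG], [0, IMAG, 0]],
    [[S3, 0, 0], [0, S3, 0], [0, 0, -2 * S3]],
]


def colour_factor(a: int, b: int) -> complex:
    """Sum over quark colour indices of (lambda_a/2)_{ij} (lambda_b/2)_{ji}."""
    la, lb = GELL_MANN[a], GELL_MANN[b]
    return sum(
        (la[i][j] / 2) * (lb[j][i] / 2) for i in range(3) for j in range(3)
    )


for a in range(8):
    for b in range(8):
        expected = 0.5 if a == b else 0.0
        assert abs(colour_factor(a, b) - expected) < 1e-12
print("colour factor equals delta_ab / 2 for all 64 index pairs")
```

The factor of one half relative to QED is what turns the coefficient $\alpha/3\pi$ into $\alpha_s/6\pi$ per quark flavour.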

The all-important, positive, first term must therefore be due to the gluons.
The one-loop graphs contributing to the calculation of $\beta_0$ are shown in figure
15.4. They include figure 15.3, of course, but there are also, characteristically,
graphs involving the gluon self-coupling which is absent in QED, and also (in
covariant gauges) ghost loops. We do not want to enter into the details of the
calculation of $\beta(\alpha_s)$ here (they are given in Peskin and Schroeder 1995, chapter
16, for example), but it would be nice to have a simple intuitive picture of the
‘antiscreening’ result in terms of the gluon interactions, say. Unfortunately no
fully satisfactory simple explanation exists, though the reader may be interested
to consult Hughes (1980, 1981) and Nielsen (1981) for a ‘paramagnetic’
type of explanation, rather than a ‘dielectric’ one.

FIGURE 15.4
Graphs contributing to the one-loop $\beta$ function in QCD. The curly line represents
a gluon, a dotted line a ghost (see section 13.3.3) and a straight line
a quark.

Returning to (15.50), we note that the equation effectively provides a prediction
of $\alpha_s$ at any scale $Q^2$, given its value at a particular scale $Q^2 = \mu^2$,
which must be taken from experiment. The reference scale is now normally
taken to be the $Z^0$ mass; the value $\alpha_s(m_Z^2)$ then plays the role in QCD that
$\alpha \sim 1/137$ does in QED.

Despite appearances, equation (15.50) does not really involve two parameters;
after all, (15.47) is only a first-order differential equation. By introducing

$$\ln\Lambda_{\rm QCD}^2 = \ln\mu^2 - \frac{1}{\beta_0\,\alpha_s(\mu^2)}, \tag{15.52}$$

equation (15.50) can be rewritten (problem 15.3) as

$$\alpha_s(Q^2) = \frac{1}{\beta_0\ln(Q^2/\Lambda_{\rm QCD}^2)}. \tag{15.53}$$

Equation (15.53) is equivalent to (cf (15.30))

$$\ln(Q^2/\Lambda_{\rm QCD}^2) = -\int_{\alpha_s(Q^2)}^{\infty}\frac{d\alpha_s}{\beta_s(\text{one loop})} \tag{15.54}$$

with $\beta_s(\text{one loop}) = -\beta_0\alpha_s^2$. $\Lambda_{\rm QCD}$ is therefore an integration constant,
representing the scale at which $\alpha_s$ would diverge to infinity (if we extended our
calculation beyond its perturbative domain of validity). More usefully, $\Lambda_{\rm QCD}$
is a measure of the scale at which $\alpha_s$ really does become ‘strong’. The extraction
of a value of $\Lambda_{\rm QCD}$ is a somewhat complicated matter, as we shall briefly
indicate in the following section, but a typical value is in the region of 200
MeV. Note that this is a distance scale of order $(200\ \text{MeV})^{-1} \sim 1$ fm, just
about the size of a hadron, which is a satisfactory connection.
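The equivalence of the two one-loop forms (15.50) and (15.53), with $\Lambda_{\rm QCD}$ fixed by (15.52), is easy to verify numerically. The sketch below is illustrative: the function names are our own, the reference value of $\alpha_s$ and the $Z^0$ mass are assumed inputs, and $N_f = 5$ is held fixed, so only scales above the b threshold are probed.

```python
import math


def beta0(n_f: int) -> float:
    """One-loop coefficient (15.49)."""
    return (33 - 2 * n_f) / (12 * math.pi)


def alpha_s_from_reference(q2: float, alpha_ref: float, mu2: float, n_f: int) -> float:
    """Equation (15.50): running from the reference scale mu^2 to Q^2."""
    return alpha_ref / (1.0 + alpha_ref * beta0(n_f) * math.log(q2 / mu2))


def lambda_qcd_squared(alpha_ref: float, mu2: float, n_f: int) -> float:
    """Equation (15.52), exponentiated to give Lambda_QCD^2."""
    return mu2 * math.exp(-1.0 / (beta0(n_f) * alpha_ref))


def alpha_s_from_lambda(q2: float, lambda2: float, n_f: int) -> float:
    """Equation (15.53)."""
    return 1.0 / (beta0(n_f) * math.log(q2 / lambda2))


M_Z = 91.19        # GeV, assumed reference scale
ALPHA_REF = 0.118  # illustrative input for alpha_s(m_Z^2)
N_F = 5

lam2 = lambda_qcd_squared(ALPHA_REF, M_Z**2, N_F)
for q in (10.0, 30.0, 91.19, 200.0):  # GeV
    a1 = alpha_s_from_reference(q**2, ALPHA_REF, M_Z**2, N_F)
    a2 = alpha_s_from_lambda(q**2, lam2, N_F)
    print(f"Q = {q:7.2f} GeV: (15.50) gives {a1:.4f}, (15.53) gives {a2:.4f}")
print(f"one-loop Lambda = {1000 * math.sqrt(lam2):.0f} MeV")
```

The two columns agree, and $\alpha_s$ falls as $Q$ rises. The value of $\Lambda$ printed here is a purely one-loop number for these assumed inputs and should not be expected to match values extracted with higher-loop formulae.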


15.3.2 Higher-order calculations, and experimental comparison

So far we have discussed only the ‘one-loop’ calculation of $\beta(\alpha_s)$. The general
perturbative expansion for $\beta_s$ can be written as

$$\beta_s(\alpha_s) = -\beta_0\alpha_s^2 - \beta_1\alpha_s^3 - \beta_2\alpha_s^4 + \ldots \tag{15.55}$$

where $\beta_0$ is the one-loop coefficient given in (15.49), $\beta_1$ is the two-loop coefficient,
and so on. $\beta_1$ was calculated by Caswell (1974) and Jones (1974), and
has the value

$$\beta_1 = \frac{153 - 19N_f}{24\pi^2}. \tag{15.56}$$

The three-loop coefficient $\beta_2$, obtained by Tarasov et al. (1980) and by Larin
and Vermaseren (1993), is

$$\beta_2 = \frac{77139 - 15099N_f + 325N_f^2}{3456\pi^3}. \tag{15.57}$$

The four-loop coefficient $\beta_3$ was calculated by van Ritbergen et al. (1997) and
by Czakon (2005); we shall not give it here. A technical point to note is that
while $\beta_0$ and $\beta_1$ are independent of the scheme adopted for renormalization
(see appendix O), the higher-order coefficients do depend on it; the value
(15.57) is in the widely used $\overline{\rm MS}$ scheme. Likewise, $\Lambda_{\rm QCD}$ will be scheme-dependent
(see appendix O), and the value $\Lambda_{\overline{\rm MS}}$ will be used here (the ‘QCD’
now being understood).

Only in the one-loop approximation for $\beta_s$ can an analytic solution of
(15.47) be obtained. However, a useful approximate solution can be found
iteratively, as follows. Consider the two-loop version of (15.54), namely

$$\ln(Q^2/\Lambda_{\overline{\rm MS}}^2) = -\int\frac{d\alpha_s}{\beta_0\alpha_s^2 + \beta_1\alpha_s^3}. \tag{15.58}$$

Expanding the denominator and integrating gives

$$\ln(Q^2/\Lambda_{\overline{\rm MS}}^2) = \frac{1}{\beta_0\alpha_s} + \frac{b_1}{\beta_0}\ln\alpha_s + C, \tag{15.59}$$

where $b_1 = \beta_1/\beta_0$ and $C$ is a constant. In the $\overline{\rm MS}$ scheme, $C$ is given by
$C = (b_1/\beta_0)\ln\beta_0$. Then the equation for $\alpha_s$ is

$$\frac{1}{\beta_0\alpha_s} + \frac{b_1}{\beta_0}\ln(\beta_0\alpha_s) = L \tag{15.60}$$

where we have defined $L = \ln(Q^2/\Lambda_{\overline{\rm MS}}^2)$. In first approximation, one sets $b_1$
to zero and finds $\alpha_s = 1/(\beta_0 L)$ as before. To obtain the next approximation,
we set $\alpha_s = 1/(\beta_0 L)$ in the $b_1$ term of (15.60), and solve for $\alpha_s$ to first order
in $b_1$. This gives (problem 15.4 (a))

$$\alpha_s = \frac{1}{\beta_0 L} - \frac{1}{\beta_0^3 L^2}\,\beta_1\ln L. \tag{15.61}$$


Problem 15.4 (b) carries the calculation to the three-loop stage.
The current world average value of $\alpha_s(m_Z^2)$ is (Bethke 2009)

$$\alpha_s(m_Z^2) = 0.1184 \pm 0.0007. \tag{15.62}$$

The remarkable precision of this number represents extraordinary consistency
among the many methods used to determine it³, which include deep inelastic
scattering, electroweak fits, $e^+e^- \to$ jets, and lattice calculations (see chapter
16). If (15.62) is used to determine $\Lambda_{\overline{\rm MS}}$ from (15.61), one finds $\Lambda_{\overline{\rm MS}} = 231$
MeV; using the 3-loop formula of problem 15.4 (b) gives $\Lambda_{\overline{\rm MS}} = 213$ MeV
(Bethke 2009).
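The two-loop number quoted above can be reproduced directly from (15.61). The following sketch uses hypothetical function names and an assumed value for the $Z^0$ mass; it takes $N_f = 5$, $\Lambda_{\overline{\rm MS}} = 231$ MeV and the coefficients (15.49) and (15.56) as given in the text.

```python
import math


def beta0(n_f: int) -> float:
    """One-loop coefficient (15.49)."""
    return (33 - 2 * n_f) / (12 * math.pi)


def beta1(n_f: int) -> float:
    """Two-loop coefficient (15.56)."""
    return (153 - 19 * n_f) / (24 * math.pi**2)


def alpha_s_two_loop(q: float, lam: float, n_f: int) -> float:
    """Approximate two-loop solution (15.61); q and lam in the same units."""
    big_l = math.log(q**2 / lam**2)
    b0, b1 = beta0(n_f), beta1(n_f)
    return 1.0 / (b0 * big_l) - b1 * math.log(big_l) / (b0**3 * big_l**2)


M_Z = 91.1876        # GeV; assumed value of the Z mass
LAMBDA_MSBAR = 0.231  # GeV; two-loop value quoted in the text for N_f = 5

alpha_mz = alpha_s_two_loop(M_Z, LAMBDA_MSBAR, n_f=5)
print(f"alpha_s(m_Z^2) from (15.61): {alpha_mz:.4f}")
```

The result agrees with the central value in (15.62) to within the quoted uncertainty, which is a useful consistency check on the sign and normalization conventions in (15.61).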

These values of $\Lambda_{\overline{\rm MS}}$ are for $N_f = 5$, appropriate for the $Z^0$ mass region,
well above the b threshold. As $Q^2$ runs to smaller values, and a quark mass
threshold is crossed, $N_f$ changes by one unit, and so correspondingly do the
coefficients $\beta_0, \beta_1, \ldots$. Physical quantities must however be continuous across
a quark threshold. This requires that the values of $\alpha_s$ above and below that
threshold satisfy certain matching conditions (Rodrigo and Santamaria 1993,
Bernreuther and Wetzel 1982, Chetyrkin et al. 1997). These are satisfied by
allowing $\Lambda_{\overline{\rm MS}}$ to depend on $N_f$. At one and two loop order, the matching
condition is simply $\alpha_s^{(N_f-1)} = \alpha_s^{(N_f)}$, which can be straightforwardly implemented
in terms of $\Lambda_{\overline{\rm MS}}^{(N_f-1)}$ and $\Lambda_{\overline{\rm MS}}^{(N_f)}$. In higher orders the matching conditions
contain additional terms, which are required at $(n-1)$-loop order for an $n$-loop
calculation of $\alpha_s$.

Figure 15.5 shows a summary (Bethke 2009) of measurements of $\alpha_s$ as
a function of the energy scale $Q$, compared with the QCD prediction. The
latter is evaluated in 4-loop approximation, using 3-loop threshold matching
conditions at the masses $m_c = 1.5$ GeV and $m_b = 4.7$ GeV. The agreement is
perfect, a triumph for both experiment and theory.
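The simplest matching condition described above, continuity of $\alpha_s$ at a quark threshold, can be implemented in a few lines at one-loop order. This is only a sketch under stated assumptions: it uses the one-loop formula (15.53) on both sides of the threshold, the function names are hypothetical, and the five-flavour value of $\Lambda$ is an illustrative one-loop input rather than a fitted $\overline{\rm MS}$ value.

```python
import math


def beta0(n_f: int) -> float:
    """One-loop coefficient (15.49)."""
    return (33 - 2 * n_f) / (12 * math.pi)


def alpha_s_one_loop(q: float, lam: float, n_f: int) -> float:
    """Equation (15.53); q and lam in the same units."""
    return 1.0 / (beta0(n_f) * math.log(q**2 / lam**2))


def match_lambda_down(lam_upper: float, m_threshold: float, n_f_upper: int) -> float:
    """Lambda for (n_f_upper - 1) flavours such that the one-loop alpha_s is
    continuous at Q = m_threshold.

    Continuity requires
        beta0(n - 1) * ln(m^2 / lam_lower^2) = beta0(n) * ln(m^2 / lam_upper^2).
    """
    ratio = beta0(n_f_upper) / beta0(n_f_upper - 1)
    return m_threshold * (lam_upper / m_threshold) ** ratio


M_B = 4.7        # GeV, b-quark threshold used in the text
LAMBDA_5 = 0.09  # GeV, illustrative one-loop value for N_f = 5

lambda_4 = match_lambda_down(LAMBDA_5, M_B, n_f_upper=5)
above = alpha_s_one_loop(M_B, LAMBDA_5, 5)
below = alpha_s_one_loop(M_B, lambda_4, 4)
print(f"Lambda(4 flavours) = {1000 * lambda_4:.0f} MeV")
print(f"alpha_s at m_b: {above:.4f} (N_f = 5), {below:.4f} (N_f = 4)")
```

The coupling is continuous by construction, while $\Lambda$ itself jumps across the threshold. The additional terms that enter the matching conditions in higher orders are not included here.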


15.4 $\sigma(e^+e^- \to \text{hadrons})$ revisited


We may now return to the physical process which originally motivated this
extensive detour. The perturbative corrections to $\sigma_{\rm pt}(Q^2)$ are expressed as a
power series in $\alpha_s$,

$$\sigma(e^+e^- \to \text{hadrons}) = \sigma_{\rm pt}(Q^2)\left[1 + \sum_{n=1}^{\infty} c_n(Q^2/\mu^2)\left(\frac{\alpha_s(\mu^2)}{\pi}\right)^n\right], \tag{15.63}$$

where $\mu$ is the renormalization scale. (A similar expansion can be written for
many other physical quantities too.) The coefficients from $c_2$ onwards depend
on the renormalization scheme (see appendix O), and are usually quoted in
the $\overline{\rm MS}$ scheme. $c_1$ is the leading order (LO) coefficient, and we already know
that $c_1 = 1$ from (15.1). $c_2$ is the next-to-leading (NLO) coefficient; $c_2(1)$
was calculated by Dine and Sapirstein (1979), Chetyrkin et al. (1979) and by
Celmaster and Gonsalves (1980), and has the value $1.9857 - 0.1152N_f$. The
next-to-next-to-leading (NNLO) coefficient $c_3(1)$ was calculated by Samuel
and Surguladze (1991) and by Gorishnii et al. (1991), and is equal to $-12.8$
for five flavours. The N$^3$LO coefficient $c_4(1)$ (which requires the evaluation of
some twenty thousand diagrams) may be found in Baikov et al. (2008) and
Baikov et al. (2009).


³ With the exception of a long-standing systematic difference: results from structure
functions prefer a smaller value of $\alpha_s(m_Z^2)$ than most of the others.


FIGURE 15.5
Comparison between measurements of $\alpha_s$ and the theoretical prediction, as a
function of the energy scale $Q$ (Bethke 2009). (See color plate I.)

The physical cross section o(ete” — hadrons) must be independent of the 
renormalization scale u?, and this would also be true of the series in (15.63) if 
an infinite number of terms were kept: the p2-dependence of the coefficients 
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Cn(Q2/u2) would cancel that of as(u?). This requirement can be imposed 
order by order in a to fix the p?-dependence of the coefficients, and is a 
direct way of applying the RGE idea. Consider, for example, truncating the 
series at the n = 2 stage: 


o(ete” — hadrons) ~ op+(Q?) (1 + os) + €2(Q?/u?)(as(u?)/m)? 


(15.64) 
Differentiating with respect to u? and setting the result to zero we obtain 


y deg _ _TBlas(u”)) (15.65) 


= 
dp? (as(u?))? 

where an O(a3) term has been dropped. Substituting the one-loop result 
(15.48) — as is consistent to this order — we find 


ca(Q?/p?) = ca(1) — Bo In(Q?/p”). (15.66) 


The second term on the right-hand side of (15.66) gives the contribution iden- 
tified in (15.2). 

In practice only a finite number of terms $n = N$ will be available, and a
$\mu^2$-dependence will remain, which implies an uncertainty in the prediction of the
cross section (and similar physical observables), due to the arbitrariness of the
scale choice. This uncertainty will be of the same order as the neglected terms,
i.e. of order $\alpha_s^{N+1}$. Thus the scale dependence of a QCD prediction gives a
measure of the uncertainties due to neglected terms. For $\sigma(e^+e^- \to \text{hadrons})$
the choice of scale $\mu^2 = Q^2$ is usually made, so as to avoid large logarithms
in relations such as (15.66).
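
The reduction in scale dependence from one order to the next can be illustrated numerically. What follows is a minimal sketch, not a fit: it assumes the one-loop coupling $\alpha_s(\mu^2) = 1/(\beta_0\ln(\mu^2/\Lambda^2))$ with $\beta_0 = (33-2N_f)/12\pi$, five flavours, an illustrative value $\Lambda^2 = 0.04\ {\rm GeV}^2$, and the NLO coefficient in the form $c_2(Q^2/\mu^2) = c_2(1) - \pi\beta_0\ln(Q^2/\mu^2)$. All function names are hypothetical. It compares how much the LO and NLO truncations of (15.63) change when $\mu^2$ is varied between $Q^2/4$ and $4Q^2$.

```python
import math

def beta0(nf):
    """One-loop coefficient beta_0 = (33 - 2*N_f) / (12*pi)."""
    return (33 - 2 * nf) / (12 * math.pi)

def alpha_s(mu2, nf=5, lam2=0.04):
    """One-loop running coupling; lam2 (GeV^2) is an assumed, illustrative Lambda^2."""
    return 1.0 / (beta0(nf) * math.log(mu2 / lam2))

def c2(q2, mu2, nf=5):
    """NLO coefficient c2(1) - pi*beta_0*ln(Q^2/mu^2), with c2(1) = 1.9857 - 0.1152*N_f."""
    return (1.9857 - 0.1152 * nf) - math.pi * beta0(nf) * math.log(q2 / mu2)

def ratio(q2, mu2, order, nf=5):
    """sigma/sigma_pt truncated at LO (order=1) or NLO (order=2)."""
    a = alpha_s(mu2, nf) / math.pi
    r = 1.0 + a
    if order >= 2:
        r += c2(q2, mu2, nf) * a * a
    return r

def scale_spread(q2, order):
    """Max minus min of the truncated ratio for mu^2 in {Q^2/4, Q^2, 4*Q^2}."""
    values = [ratio(q2, f * q2, order) for f in (0.25, 1.0, 4.0)]
    return max(values) - min(values)
```

Calling `scale_spread(91.2 ** 2, 1)` and `scale_spread(91.2 ** 2, 2)` (an illustrative choice of $Q^2$ in GeV$^2$) shows the NLO spread to be markedly smaller than the LO one, in line with the statement that the residual scale dependence is of the order of the first neglected term.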

Before proceeding to our second main application of the RGE, scaling 
violations in deep inelastic scattering, it is necessary to take another detour, 
to enlarge our understanding of the scope of the RGE. 




15.5 A more general form of the RGE: anomalous 
dimensions and running masses 


The reader may have wondered why, for QCD, all the graphs of figure 15.6
are needed, whereas for QED we got away with only figure 11.3. The reason
for the simplification in QED was the equality between the renormalization
constants $Z_1$ and $Z_2$, which therefore cancelled out in the relation between
the renormalized and bare charges $e$ and $e_0$, as briefly stated before equation
(15.8) (this equality was discussed in section 11.6). We recall that $Z_2$ is the
field strength renormalization factor for the charged fermion in QED, and $Z_1$
is the vertex part renormalization constant; their relation to the counter terms
was given in equation (11.7). For QCD, although gauge invariance does imply
generalizations of the Ward identity used to prove $Z_1 = Z_2$ (Taylor 1971,
Slavnov 1972), the consequence is no longer the simple relation ‘$Z_1 = Z_2$’
in this case, due essentially to the ghost contributions. In order to see what
change $Z_1 \neq Z_2$ would make, let us return to the one-loop calculation of $\beta$ for
QED, pretending that $Z_1 \neq Z_2$. We have


$$e_0 = \frac{Z_1}{Z_2 Z_3^{1/2}}\,e_\mu \qquad (15.67)$$


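where, because we are renormalizing at scale $\mu$, all the $Z_i$’s depend on $\mu$ (as
in (15.15)), but we shall now not indicate this explicitly. Taking logs and
differentiating with respect to $\mu$ at constant $e_0$, we obtain

$$\mu\frac{{\rm d}}{{\rm d}\mu}\bigg|_{e_0}\ln Z_1 - \mu\frac{{\rm d}}{{\rm d}\mu}\bigg|_{e_0}\ln Z_2 - \frac{1}{2}\mu\frac{{\rm d}}{{\rm d}\mu}\bigg|_{e_0}\ln Z_3 + \frac{\mu}{e_\mu}\frac{{\rm d}e_\mu}{{\rm d}\mu}\bigg|_{e_0} = 0. \qquad (15.68)$$

Hence

$$\beta(e_\mu) \equiv \mu\frac{{\rm d}e_\mu}{{\rm d}\mu}\bigg|_{e_0} = e_\mu\gamma_3 + 2e_\mu\gamma_2 - e_\mu\,\mu\frac{{\rm d}}{{\rm d}\mu}\bigg|_{e_0}\ln Z_1, \qquad (15.69)$$

where

$$\gamma_2 \equiv \frac{1}{2}\mu\frac{{\rm d}}{{\rm d}\mu}\bigg|_{e_0}\ln Z_2, \qquad \gamma_3 \equiv \frac{1}{2}\mu\frac{{\rm d}}{{\rm d}\mu}\bigg|_{e_0}\ln Z_3. \qquad (15.70)$$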


To leading order in $e_\mu$, the $\gamma_3$ term in (15.70) reproduces (15.26) when (15.15)
is used for $Z_3$, the other two terms in (15.68) cancelling via $Z_1 = Z_2$. So if,
as in the case of QCD, $Z_1$ is not equal to $Z_2$, we need to introduce the
contributions from loops determining the fermion field strength renormalization
factor, as well as those related to the vertex parts (together with appropriate
ghost loops), in addition to the vacuum polarization loop associated with
$Z_3$.

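Quantities such as $\gamma_2$ and $\gamma_3$ have an interesting and important
significance, which we shall illustrate in the case of $\gamma_2$ for QED. $Z_2$ enters into the
relation between the propagator of the bare fermion $\langle\Omega|T(\psi_0(x)\bar\psi_0(0))|\Omega\rangle$ and
the renormalized one, via (cf (11.2))

$$\langle\Omega|T(\psi(x)\bar\psi(0))|\Omega\rangle = \frac{1}{Z_2}\langle\Omega|T(\psi_0(x)\bar\psi_0(0))|\Omega\rangle, \qquad (15.71)$$

where (cf section 10.1.3) $|\Omega\rangle$ is the vacuum of the interacting theory. The
Fourier transform of (15.71) is, of course, the Feynman propagator:

$$\tilde S'_F(q^2) = \int {\rm d}^4x\; e^{{\rm i}q\cdot x}\,\langle\Omega|T(\psi(x)\bar\psi(0))|\Omega\rangle. \qquad (15.72)$$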
Suppose we now ask: what is the large $-q^2$ behaviour of (15.72) for space-like
$q^2$, with $-q^2 \gg m^2$ where $m$ is the fermion mass? This sounds very similar
to the question answered in 15.2.3 for the quantity $S(|q^2|/\mu^2, e_\mu)$. However,
the latter was dimensionless whereas (recalling that $\psi$ has mass dimension $\frac{3}{2}$)
$\tilde S'_F(q^2)$ has dimension $M^{-1}$. This dimensionality is, of course, just what a
propagator of the free-field form ${\rm i}/(\not q - m)$ would provide.

Accordingly, we extract this $(\not q)^{-1}$ factor (compare $\sigma/\sigma_{\rm pt}$) and consider
the dimensionless ratio $R'_F(|q^2|/\mu^2, \alpha_\mu) = \not q\,\tilde S'_F(q^2)$. We might guess that, just
as for $S(|q^2|/\mu^2, \alpha_\mu)$, to get the leading large $|q^2|$ behaviour we will need to
calculate $R'_F$ to some order in $\alpha_\mu$, and then replace $\alpha_\mu$ by $\alpha(|q^2|/\mu^2)$. But this
is not quite all. The factor $Z_2$ in (15.71) will, as noted above, depend on
the renormalization scale $\mu$, just as $Z_3$ of (15.15) did. Thus when we change
$\mu$, the normalization of the $\psi$’s will change via the $Z_2^{1/2}$ factors (of course by
a finite amount here) and we must include this change when writing down
the analogue of (15.33) for this case (i.e. the condition that the ‘total change,
on changing $\mu$, is zero’). The required equation is

$$\left[\mu^2\frac{\partial}{\partial\mu^2}\bigg|_{\alpha_\mu} + \beta(\alpha_\mu)\frac{\partial}{\partial\alpha_\mu} + \gamma_2(\alpha_\mu)\right]R'_F(|q^2|/\mu^2, \alpha_\mu) = 0. \qquad (15.73)$$

The solution of (15.73) is somewhat more complicated than that of (15.33).
We can gain insight into the essential difference caused by the presence of $\gamma_2$
by considering the special case $\beta(\alpha_\mu) = 0$. In this case, we easily find

$$R'_F(|q^2|/\mu^2, \alpha_\mu) \propto (\mu^2)^{-\gamma_2(\alpha_\mu)}. \qquad (15.74)$$

But since $R'_F$ can only depend on $\mu$ via $|q^2|/\mu^2$, we learn that if $\beta = 0$ then
the large $|q^2|$ behaviour of $R'_F$ is given by $(|q^2|/\mu^2)^{\gamma_2}$ or, in other words,
that at large $|q^2|$

$$\tilde S'_F(|q^2|/\mu^2, \alpha_\mu) \propto \frac{1}{\not q}\left(\frac{|q^2|}{\mu^2}\right)^{\gamma_2}. \qquad (15.75)$$


Thus, at a zero of the $\beta$-function, $\tilde S'_F$ has an ‘anomalous’ power law dependence
on $|q^2|$ (i.e. in addition to the obvious $\not q^{-1}$ factor), which is controlled by the
parameter $\gamma_2$. The latter is called the ‘anomalous dimension’ of the fermion
field, since its presence effectively means that the $|q^2|$ behaviour of $\tilde S'_F$ is not
determined by its ‘normal’ dimensionality $M^{-1}$. The behaviour (15.75) is
often referred to as ‘scaling with anomalous dimension’, meaning that if we
multiply $|q^2|$ by a scale factor $\lambda$, then $\tilde S'_F$ is multiplied by $\lambda^{\gamma_2(\alpha)-1}$ rather than
just $\lambda^{-1}$. Anomalous dimensions turn out to play a vital role in the theory of
critical phenomena: they are, in fact, closely related to ‘critical exponents’
(see section 16.4.3, and Peskin and Schroeder 1995, chapter 13). Scaling with
anomalous dimensions is also exactly what occurs in deep inelastic scattering
of leptons from nucleons, as we shall see in section 15.6.

FIGURE 15.6
Possible behaviour of $\beta$ functions. (a) The slope is positive near the origin (as
in QED), and negative near $\alpha = \alpha^*$. (b) The slope is negative at the origin
(as in QCD), and positive near $\alpha_s = \alpha_s^*$.

The full solution of (15.73) for $\beta \neq 0$ is elegantly discussed in Coleman
(1985), chapter 3; see also Peskin and Schroeder (1995) section 12.3. We quote
it here:

$$R'_F(|q^2|/\mu^2, \alpha_\mu) = R'_F(1, \alpha(|q^2|/\mu^2))\,\exp\left\{\int_0^t {\rm d}t'\,\gamma_2(\alpha(t'))\right\}. \qquad (15.76)$$

The first factor is the expected one from section 15.2.3; the second results
from the addition of the $\gamma_2$ term in (15.73). Suppose now that $\beta(\alpha)$ has a
zero at some point $\alpha^*$, in the vicinity of which $\beta(\alpha) \approx -B(\alpha-\alpha^*)$ with $B > 0$.
Then, near this point the evolution of $\alpha$ is given by (cf (15.39))

$$\ln(|q^2|/\mu^2) = \int_{\alpha_\mu}^{\alpha(|q^2|)}\frac{{\rm d}\alpha}{-B(\alpha-\alpha^*)} \qquad (15.77)$$

which implies

$$\alpha(|q^2|) = \alpha^* + \text{constant}\times(\mu^2/|q^2|)^B. \qquad (15.78)$$

Thus asymptotically for large $|q^2|$, the coupling will evolve to the ‘fixed point’
$\alpha^*$. In this case, at sufficiently large $-q^2$, the integral in (15.76) can be
evaluated by setting $\alpha(t') = \alpha^*$, and $R'_F$ will scale with an anomalous dimension
$\gamma_2(\alpha^*)$ determined by the fixed point value of $\alpha$. The behaviour of such an
$\alpha$ is shown in figure 15.6(a). We emphasize that there is no reason to believe
that the QED $\beta$ function actually does behave like this.
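
The approach to the fixed point can be checked numerically. The sketch below assumes the linearized form $\beta(\alpha) \approx -B(\alpha-\alpha^*)$ quoted above and integrates ${\rm d}\alpha/{\rm d}t = \beta(\alpha)$ in $t = \ln(|q^2|/\mu^2)$, comparing the result with the power-law form (15.78). The function names and the numerical values of $\alpha^*$, $B$ and the starting coupling are hypothetical, chosen only for illustration.

```python
import math

def flow_to_fixed_point(alpha0, alpha_star, B, t_max, steps=10000):
    """Integrate d(alpha)/dt = -B*(alpha - alpha_star) from t = 0 to t_max (midpoint method)."""
    h = t_max / steps
    alpha = alpha0
    for _ in range(steps):
        k1 = -B * (alpha - alpha_star)
        k2 = -B * (alpha + 0.5 * h * k1 - alpha_star)
        alpha += h * k2
    return alpha

def closed_form(alpha0, alpha_star, B, t):
    """alpha* + (alpha0 - alpha*) * (mu^2/|q^2|)^B, written with t = ln(|q^2|/mu^2)."""
    return alpha_star + (alpha0 - alpha_star) * math.exp(-B * t)
```

For example, `flow_to_fixed_point(0.1, 0.5, 0.7, 5.0)` and `closed_form(0.1, 0.5, 0.7, 5.0)` agree closely, and both tend to $\alpha^* = 0.5$ as $t$ grows, whichever side of $\alpha^*$ the flow starts from.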

The point $\alpha^*$ in figure 15.6(a) is called an ultraviolet-stable fixed point:
$\alpha$ ‘flows’ towards it at large $|q^2|$. In the case of QCD, the $\beta$ function starts
out negative, so that the corresponding behaviour (with a zero at $\alpha_s^* \neq 0$)
would look like that shown in figure 15.6(b). In this case, the reader can check
(problem 15.5) that $\alpha_s^*$ is reached in the infrared limit $q^2 \to 0$, and so $\alpha_s^*$ is
called an infrared-stable fixed point. Clearly it is the slope of $\beta$ near the fixed
point that determines whether it is u-v or i-r stable. This applies equally to
a fixed point at the origin, so that QED is i-r stable at $\alpha = 0$ while QCD is
u-v stable at $\alpha_s = 0$.

We must now point out to the reader an error in the foregoing analysis, in
the case of a gauge theory. The quantity $Z_2$ is not gauge invariant in QED
(or QCD), and hence $\gamma_2$ depends on the choice of gauge. This is really no
surprise, because the full fermion propagator itself is not gauge invariant (the
free-field propagator is gauge invariant, of course). What ultimately matters
is that the complete physical amplitude for any process, at a given order of
$\alpha$, be gauge invariant. Thus the analysis given above really only applies, in
this simple form, to non-gauge theories, such as the ABC model of chapter
6, or to gauge-invariant quantities.

This is an appropriate point at which to consider the treatment of quark 
masses in the RGE-based approach. Up to now we have simply assumed 
that the relevant |q?| is very much greater than all quark masses, the latter 
therefore being neglected. While this may be adequate for the light quarks u, 
d, s, it seems surely a progressively worse assumption for c, b and t. However, 
in thinking about how to re-introduce the quark masses into our formalism, 
we are at once faced with a difficulty: how are they to be defined? For an 
unconfined particle, such as a lepton, it seems natural to define ‘the’ mass as 
the position of the pole of the propagator (i.e. the ‘on-shell’ value $p^2 = m^2$), a
definition we followed in chapters 10 and 11. Significantly, renormalization is
required (in the shape of a mass counter-term) to achieve a pole at the ‘right’
physical mass $m$, in this sense. But this prescription is inherently perturbative,
and cannot be used for a confined particle, which never ‘escapes’ beyond the
range of the non-perturbative confining forces, and whose propagator can
therefore never approach the form $\sim(\not p - m)^{-1}$ of a free particle.

Our present perspective on renormalization suggests an obvious way for- 
ward. Just as there was, in principle, no necessity to define the QED coupling 
parameter e via an on-shell prescription, so here a mass parameter in the La- 
grangian can be defined in any way we find convenient; all that is necessary 
is that it should be possible to determine its value from some measurable 
quantity (for example, quark masses from lattice QCD predictions of hadron 
masses). Effectively, we are regarding the ‘$m$’ in a term such as $-m\bar\psi(x)\psi(x)$
as a ‘coupling constant’ having mass dimension 1 (and, after all, the ABC
coupling itself had mass dimension 1). Incidentally, the operator $\bar\psi(x)\psi(x)$ is
gauge invariant, as is any such local operator. Taking this point of view, it is
clear that a renormalization scale will be involved in such a general definition
of mass, and we must expect to see our mass parameters ‘evolve’ with this
scale, just as the gauge (or other) couplings do. In turn, this will get
translated into a $|q^2|$-dependence of the mass parameters, just as for $\alpha(|q^2|)$ and
$\alpha_s(|q^2|)$.

The RGE in such a scheme now takes the form

$$\left[\mu^2\frac{\partial}{\partial\mu^2} + \beta(\alpha_s)\frac{\partial}{\partial\alpha_s} + \sum_i\gamma_i(\alpha_s) + \gamma_m(\alpha_s)\,m\frac{\partial}{\partial m}\right]R(|q^2|/\mu^2, \alpha_s, m/|q|) = 0 \qquad (15.79)$$

where the partial derivatives are taken at fixed values of the other two
variables. Here the $\gamma_i$ are the anomalous dimensions relevant to the quantity $R$,
and $\gamma_m$ is an analogous ‘anomalous mass dimension’, arising from finite shifts
in the mass parameter when the scale $\mu^2$ is changed. Just as with the solution
(15.76) of (15.73), the solution of (15.79) is given in terms of a ‘running mass’
$m(|q^2|)$. Formally, we can think of $\gamma_m$ in (15.79) as analogous to $\beta(\alpha_s)$ and
$\ln m$ as analogous to $\alpha_s$. Then equation (15.41) for the running $\alpha_s$,

$$\frac{\partial\alpha_s(|q^2|)}{\partial t} = \beta(\alpha_s(|q^2|)) \qquad (15.80)$$

where $t = \ln(|q^2|/\mu^2)$, becomes

$$\frac{\partial(\ln m(|q^2|))}{\partial t} = \gamma_m(\alpha_s(|q^2|)). \qquad (15.81)$$

Equation (15.81) has the solution

$$m(|q^2|) = m(\mu^2)\exp\int_{\ln\mu^2}^{\ln|q^2|}{\rm d}\ln|q'^2|\;\gamma_m(\alpha_s(|q'^2|)). \qquad (15.82)$$


To one-loop order in QCD, $\gamma_m(\alpha_s)$ turns out to be $-\frac{1}{\pi}\alpha_s$ (Peskin and
Schroeder 1995, section 18.1). Inserting the one-loop solution for $\alpha_s$ in the
form (15.53), we find

$$m(|q^2|) = m(\mu^2)\left[\frac{\ln(\mu^2/\Lambda^2)}{\ln(|q^2|/\Lambda^2)}\right]^{\frac{1}{\pi\beta_0}} \qquad (15.83)$$

where $(\pi\beta_0)^{-1} = 12/(33 - 2N_f)$. Thus the quark masses decrease
logarithmically as $|q^2|$ increases, rather like $\alpha_s(|q^2|)$. It follows that, in general, quark
mass effects are suppressed both by explicit $m^2/|q^2|$ factors, and by the
logarithmic decrease given by (15.83). Further discussion of the treatment of
quark masses is contained in Ellis, Stirling and Webber (1996), section 2.4;
see also the review by Manohar and Sachrajda in Nakamura et al. 2010.
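
As a numerical illustration of the running mass, the following sketch evaluates the closed one-loop form (15.83) and, as a cross-check, integrates the exponent of (15.82) directly with $\gamma_m = -\alpha_s/\pi$. It assumes the one-loop coupling $\alpha_s(|q^2|) = 1/(\beta_0\ln(|q^2|/\Lambda^2))$; the value $\Lambda^2 = 0.04\ {\rm GeV}^2$, the reference mass and the function names are illustrative assumptions, not values taken from the text.

```python
import math

def beta0(nf):
    """One-loop coefficient beta_0 = (33 - 2*N_f) / (12*pi)."""
    return (33 - 2 * nf) / (12 * math.pi)

def alpha_s(q2, nf=5, lam2=0.04):
    """One-loop coupling; lam2 (GeV^2) is an assumed, illustrative Lambda^2."""
    return 1.0 / (beta0(nf) * math.log(q2 / lam2))

def running_mass(m_ref, mu2, q2, nf=5, lam2=0.04):
    """Closed form: m(mu^2) * [ln(mu^2/L^2) / ln(q^2/L^2)]^(12/(33 - 2*N_f))."""
    exponent = 12.0 / (33 - 2 * nf)
    return m_ref * (math.log(mu2 / lam2) / math.log(q2 / lam2)) ** exponent

def running_mass_numeric(m_ref, mu2, q2, nf=5, lam2=0.04, steps=20000):
    """Midpoint quadrature of the exponent in (15.82), using gamma_m = -alpha_s/pi."""
    lo, hi = math.log(mu2), math.log(q2)
    h = (hi - lo) / steps
    integral = 0.0
    for i in range(steps):
        q2_mid = math.exp(lo + (i + 0.5) * h)
        integral += -alpha_s(q2_mid, nf, lam2) / math.pi * h
    return m_ref * math.exp(integral)
```

With a hypothetical reference value `m_ref = 4.2` GeV at `mu2 = 4.2 ** 2`, the two functions agree at `q2 = 91.2 ** 2` and return a mass noticeably below the reference value, showing the logarithmic decrease.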




15.6 QCD corrections to the parton model predictions 
for deep inelastic scattering: scaling violations 


As we saw in section 9.2, the parton model provides a simple intuitive
explanation for the experimental observation that the nucleon structure functions
in deep inelastic scattering depend, to a good first approximation, only on
the dimensionless ratio $x = Q^2/2M\nu$, rather than on $Q^2$ and $\nu$ separately;
this behaviour is referred to as ‘scaling’. Here $M$ is the nucleon mass, and $Q^2$
and $\nu$ are defined in (9.7) and (9.8). In this section we shall show how QCD
corrections to the simple parton model, calculated using RGE techniques,
predict observable violations of scaling in deep inelastic scattering. As we shall
see, comparison between the theoretical predictions and experimental
measurements provides strong evidence for the correctness of QCD as the theory
of nucleonic constituents.


15.6.1 Uncancelled mass singularities at order $\alpha_s$


The free parton model amplitudes we considered in chapter 9 for deep inelastic 
lepton-nucleon scattering were of the form shown in figure 15.7 (cf figure 9.4). 
The obvious first QCD corrections will be due to real gluon emission by either 
the initial or final quark, as shown in figure 15.8, but to these we must add the 
one-loop virtual gluon processes of figure 15.9 in order (see below) to get rid 
of infrared divergences similar to those encountered in section 14.4.2, and also 
the diagram of figure 15.10, corresponding to the presence of gluons in the 
nucleon. To simplify matters, we shall consider what is called a ‘non-singlet
structure function’ $F_2^{\rm NS}$, such as $F_2^{ep} - F_2^{en}$, in which the (flavour) singlet gluon
contribution cancels out, leaving only the diagrams of figures 15.8 and 15.9.

We now want to perform, for these diagrams, calculations analogous to
those of section 9.2, which enabled us to find the e-N structure functions
$\nu W_2$ and $MW_1$ from the simple parton process of figure 15.7. There are two
problems here: one is to find the parton level $W$’s corresponding to figure
15.8 (leaving aside figure 15.9 for the moment), cf equations (9.29) and
(9.30) in the case of the free parton diagram figure 15.7; the other is to relate
these parton $W$’s to observed nucleon $W$’s via an integration over momentum
fractions. In section 9.2 we solved the first problem by explicitly calculating
the parton level ${\rm d}^2\sigma^i/{\rm d}Q^2{\rm d}\nu$ and picking off the associated $\nu W_2^i$, $W_1^i$. In
principle, the same can be done here, starting from the five-fold differential
cross section for our $e^- + q \to e^- + q + g$ process. However, a simpler, if
somewhat heuristic, way is available. We note from (9.46) that in general
$F_1 = MW_1$ is given by the transverse virtual photon cross section

$$W_1 = \sigma_T/(4\pi^2\alpha/K) = \frac{1}{2}\sum_{\lambda=\pm1}\epsilon^*_\mu(\lambda)\,\epsilon_\nu(\lambda)\,W^{\mu\nu} \qquad (15.84)$$

where $W^{\mu\nu}$ was defined in (9.3). Further, the Callan-Gross relation is still
true (the photon only interacts with the charged partons, which are quarks
with spin $\frac{1}{2}$ and charge $e_i$), and so

$$F_2/x = 2F_1 = 2MW_1 = \sigma_T/(4\pi^2\alpha/2MK). \qquad (15.85)$$


FIGURE 15.7
Electron-quark scattering via one-photon exchange.

FIGURE 15.8
Electron-quark scattering with single-gluon emission.

FIGURE 15.9
Virtual single-gluon corrections to figure 15.7.

FIGURE 15.10
Electron-gluon scattering with $q\bar q$ production.

FIGURE 15.11
Virtual photon processes entering into figure 15.8.

These formulae are valid for both parton and proton $W_1$’s and $W^{\mu\nu}$’s, with
appropriate changes for parton masses $\hat M$. Hence the parton level $2\hat F_1$ for figure
15.8 is just the transverse photon cross section as calculated from the graphs
of figure 15.11, divided by the factor $4\pi^2\alpha/2\hat M\hat K$, where as usual the hat denotes
kinematic quantities in the corresponding parton process. This cross section,
however, is (apart from a colour factor) just the virtual Compton cross
section calculated in section 8.6. Also, taking the same (Hand) convention for
the individual photon flux factors,


$$2\hat M\hat K = \hat s. \qquad (15.86)$$


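Thus for the parton processes of figure 15.11,

$$2\hat F_1 = \hat\sigma_T/(4\pi^2\alpha/2\hat M\hat K) = \frac{\hat s}{4\pi^2\alpha}\int_{-1}^{1}{\rm d}\cos\theta\;\frac{4}{3}\,\frac{\pi e_i^2\alpha\,\alpha_s(\mu^2)}{\hat s}\left(-\frac{\hat t}{\hat s} - \frac{\hat s}{\hat t} + \frac{2\hat u Q^2}{\hat s\hat t}\right) \qquad (15.87)$$

where, in going from (8.181) to (15.87), we have inserted a colour factor $\frac{4}{3}$
(problem 14.5 (a)), renamed the variables $\hat t \to \hat u$, $\hat u \to \hat t$ in accordance with
figure 15.11, and replaced $\alpha^2$ by $e_i^2\alpha\,\alpha_s(\mu^2)$.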

Before proceeding with (15.87), it is helpful to consider the other part of
the calculation, namely the relation between the nucleon $F_1$ and the parton
$\hat F_1$. We mimic the discussion of section 9.2, but with one significant difference:
the quark ‘taken’ from the proton still has momentum fraction $y$ (momentum
$yp$), but now its longitudinal momentum must be degraded in the final state
due to the gluon bremsstrahlung process we are calculating. Let us call the
quark momentum after gluon emission $zyp$ (figure 15.12). Then, assuming as
in section 9.2 that it stays on-shell, we have

$$q^2 + 2zy\,q\cdot p = 0 \qquad (15.88)$$

or

$$x = yz, \qquad x = Q^2/2q\cdot p, \qquad q^2 = -Q^2 \qquad (15.89)$$

and we can write (cf (9.31))

$$\frac{F_2}{x} = 2F_1 = \sum_i\int_0^1{\rm d}y\,f_i(y)\int_0^1{\rm d}z\;2\hat F_1^i\,\delta(x - yz) \qquad (15.90)$$


where the $f_i(y)$ are the parton distribution functions introduced in section 9.2
(we often call them $q(x)$ or $g(x)$ as the case may be) for parton type $i$, and
the sum is over contributing partons. The reader may enjoy checking that
(15.90) does reduce to (9.34) for free partons by showing that in that case
$2\hat F_1^i = e_i^2\delta(1 - z)$ (see Halzen and Martin 1984, section 10.3, for help), so that
$2F_1^{\rm free} = \sum_i e_i^2 f_i(x)$.

FIGURE 15.12
The first process of figure 15.11, viewed as a contribution to $e^-$-nucleon
scattering.

FIGURE 15.13
Kinematics for the parton process of figure 15.12.

To proceed further with the calculation (i.e. of (15.87) inserted into (15.90)),
we need to look at the kinematics of the $\gamma q \to qg$ process, in the CMS.
Referring to figure 15.13, we let $k, k'$ be the magnitudes of the CMS momenta
$\boldsymbol{k}, \boldsymbol{k}'$. Then

$$\begin{aligned}
\hat s &= 4k'^2 = (yp + q)^2 = Q^2(1-z)/z, \qquad z = Q^2/(\hat s + Q^2)\\
\hat t &= (q - p')^2 = -2kk'(1 - \cos\theta) = -Q^2(1 - c)/2z, \qquad c = \cos\theta\\
\hat u &= (q - q')^2 = -2kk'(1 + \cos\theta) = -Q^2(1 + c)/2z. 
\end{aligned} \qquad (15.91)$$


We now note that in the integral (15.87) for $\hat F_1$, when we integrate over
$c = \cos\theta$, we shall obtain an infinite result

$$\sim\int_{-1}^{1}\frac{{\rm d}c}{1 - c} \qquad (15.92)$$

associated with the vanishing of $\hat t$ in the ‘forward’ direction (i.e. when $\boldsymbol{q}$ and $\boldsymbol{p}'$
are parallel). This is a divergence of the ‘collinear’ type, in the terminology of
section 14.4.2 or, as there, a ‘mass singularity’, occurring in the zero quark
mass limit. If we simply replace the propagator factor $\hat t^{-1} = [(q - p')^2]^{-1}$ by
$[(q - p')^2 - m^2]^{-1}$, where $m$ is a quark mass, then (15.92) becomes

$$\sim\int_{-1}^{1}\frac{{\rm d}c}{(1 + 2m^2z/Q^2) - c} \qquad (15.93)$$


which will produce a factor of the form $\ln(Q^2/m^2)$ as $m^2 \to 0$. Thus $m$
regulates the divergence. We have here an uncancelled mass singularity, and it
violates scaling. This crucial physical result is present in the lowest-order QCD 
correction to the parton model, in this case. As we are learning, such loga- 
rithmic violations of scaling are a characteristic feature of all QCD corrections 
to the free (scaling) parton model. 
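
The logarithm can be made explicit with a short numerical check. The sketch below assumes the regulated angular integral has the form $\int_{-1}^{1}{\rm d}c/[(1+\epsilon) - c]$ with $\epsilon = 2m^2z/Q^2$, which follows from inserting $m^2$ into the propagator with the kinematics of (15.91); its closed form is $\ln[(2+\epsilon)/\epsilon]$, which tends to $\ln(Q^2/m^2) - \ln z$ as $m^2 \to 0$. The function names are hypothetical.

```python
import math

def regulated_integral(q2, m2, z):
    """Closed form of int_{-1}^{1} dc / ((1 + eps) - c), with eps = 2*m^2*z/Q^2."""
    eps = 2.0 * m2 * z / q2
    return math.log((2.0 + eps) / eps)

def regulated_integral_numeric(q2, m2, z, steps=200000):
    """Midpoint-rule check of the same integral (adequate when eps is not too small)."""
    eps = 2.0 * m2 * z / q2
    h = 2.0 / steps
    return sum(h / ((1.0 + eps) - (-1.0 + (i + 0.5) * h)) for i in range(steps))
```

For a small mass, `regulated_integral(q2, m2, z)` differs from `math.log(q2 / m2)` only by the finite, mass-independent constant $-\ln z$, which is the kind of non-singular remainder that accompanies the $\ln(Q^2/m^2)$ term.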

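We may calculate the coefficient of the $\ln Q^2$ term by retaining in (15.87)
only the terms proportional to $\hat t^{-1}$:

$$2\hat F_1^i \approx e_i^2\int_{-1}^{1}\frac{{\rm d}c}{1 - c}\left(\frac{\alpha_s(\mu^2)}{2\pi}\,\frac{4}{3}\,\frac{1 + z^2}{1 - z}\right) \qquad (15.94)$$

and so, for just one quark species, this QCD correction contributes (from
(15.90)) a term

$$\frac{e_i^2\,\alpha_s(\mu^2)}{2\pi}\int_x^1\frac{{\rm d}y}{y}\,q(y)\left\{\hat P_{qq}(x/y)\ln(Q^2/m^2) + C(x/y)\right\} \qquad (15.95)$$

to $2F_1$, where

$$\hat P_{qq}(z) = \frac{4}{3}\left(\frac{1 + z^2}{1 - z}\right) \qquad (15.96)$$

and $C(x/y)$ has no mass singularity.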
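
The coefficient of the collinear pole can be verified directly from the kinematics. The sketch below (hypothetical function names) builds the invariants of (15.91), evaluates the bracket $-\hat t/\hat s - \hat s/\hat t + 2\hat uQ^2/\hat s\hat t$ of (15.87), and checks that multiplying it by $(1-c)$ and letting $c \to 1$ gives $2(1+z^2)/(1-z)$. Combined with the prefactor $e_i^2\alpha_s/3\pi$ of (15.87), this reproduces the factor $\frac{\alpha_s}{2\pi}\frac{4}{3}\frac{1+z^2}{1-z}$ in (15.94).

```python
def mandelstam(q2, z, c):
    """Parton-level invariants for given Q^2, z and c = cos(theta)."""
    s = q2 * (1.0 - z) / z
    t = -q2 * (1.0 - c) / (2.0 * z)
    u = -q2 * (1.0 + c) / (2.0 * z)
    return s, t, u

def angular_bracket(q2, z, c):
    """The bracket -t/s - s/t + 2*u*Q^2/(s*t) multiplying the coupling factors."""
    s, t, u = mandelstam(q2, z, c)
    return -t / s - s / t + 2.0 * u * q2 / (s * t)

def collinear_coefficient(z):
    """Expected residue of the 1/(1 - c) pole: 2*(1 + z^2)/(1 - z)."""
    return 2.0 * (1.0 + z * z) / (1.0 - z)
```

A useful by-product is the check $\hat s + \hat t + \hat u = -Q^2$, as expected for massless partons and a photon of virtuality $q^2 = -Q^2$.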
Our result so far is therefore that the ‘free’ quark distribution function
$q(x)$, which depended only on the scaling variable $x$, becomes modified to

$$q(x) + \frac{\alpha_s(\mu^2)}{2\pi}\int_x^1\frac{{\rm d}y}{y}\,q(y)\left\{\hat P_{qq}(x/y)\ln(Q^2/m^2) + C(x/y)\right\} \qquad (15.97)$$

$$= q(x) + \frac{\alpha_s(\mu^2)}{2\pi}\int_0^1{\rm d}y\int_0^1{\rm d}z\;\delta(x - yz)\,q(y)\left\{\hat P_{qq}(z)\ln(Q^2/m^2) + C(z)\right\} \qquad (15.98)$$

due to lowest-order gluon radiation. Clearly, this corrected distribution
function violates scaling because of the $\ln Q^2$ term. But the result as it stands
cannot represent a well-controlled approximation, since it contains divergences
as $z \to 1$ and as $m^2 \to 0$.

We postpone discussion of the mass divergence until the next section. The
divergence as $z \to 1$ is a standard infrared divergence (the quark momentum
$yzp$ after gluon emission becomes equal to the quark momentum $yp$ before
emission), and we expect that it can be cured by including the virtual gluon
diagrams of figure 15.9, as indicated at the start of the section (and as was
done analogously in the case of $e^+e^-$ annihilation). This has been verified
explicitly by Kim and Schilcher (1978) and by Altarelli et al. (1978 a, b;
1979). Alternatively, we follow the procedure of Altarelli and Parisi (1977).
First we regulate the divergence as $z \to 1$ by defining a regulated function
$1/(1 - z)_+$ such that

$$\int_0^1{\rm d}z\,\frac{f(z)}{(1 - z)_+} = \int_0^1{\rm d}z\,\frac{f(z) - f(1)}{1 - z} \qquad (15.99)$$

where $f(z)$ is any test function sufficiently regular at the end points. Now the
gluon loops which will cancel the i-r divergence only contribute at $z \to 1$, in
leading log approximation. Thus the i-r finite version of $\hat P_{qq}$ has the form

$$P_{qq}(z) = \frac{4}{3}\,\frac{1 + z^2}{(1 - z)_+} + A\,\delta(1 - z). \qquad (15.100)$$

The coefficient $A$ is determined by the physical requirement that the net
number of quarks (i.e. the number of quarks minus the number of antiquarks)
does not vary with $Q^2$. From (15.98) this implies

$$\int_0^1 P_{qq}(z)\,{\rm d}z = 0. \qquad (15.101)$$

Inserting (15.100) into (15.101), and using (15.99), we find (problem 15.6)

$$A = 2, \qquad (15.102)$$

so that

$$P_{qq}(z) = \frac{4}{3}\,\frac{1 + z^2}{(1 - z)_+} + 2\,\delta(1 - z). \qquad (15.103)$$


The function $P_{qq}$ is called a ‘splitting function’, and it has an
important physical interpretation. The quantity $\alpha_s(\mu^2)/(2\pi)\,P_{qq}(z)$ is, for $z < 1$,
the probability that, to first order in $\alpha_s$, a quark having radiated a gluon is
left with a fraction $z$ of its original momentum. Similar functions arise in
QED in connection with what is called the ‘equivalent photon approximation’
(Weizsäcker 1934, Williams 1934, Chen and Zerwas 1975). The application
of these techniques to QCD corrections to the free parton model is due to 
Altarelli and Parisi (1977), who thereby opened the way to this simpler and 
more physical way of understanding scaling violations, which had previously 
been discussed mainly within the rather technical operator product formalism 
(Wilson 1969). 
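
The quark-number condition and the value of $A$ can be checked numerically. The sketch below assumes the standard plus prescription, $\int_0^1{\rm d}z\,f(z)/(1-z)_+ = \int_0^1{\rm d}z\,[f(z)-f(1)]/(1-z)$, and the regulated splitting function $P_{qq}(z) = \frac{4}{3}(1+z^2)/(1-z)_+ + 2\delta(1-z)$; the function name and the quadrature are illustrative choices. It computes the moments $\int_0^1{\rm d}z\,z^{n-1}P_{qq}(z)$ with a midpoint rule.

```python
def pqq_moment(n, steps=20000):
    """int_0^1 dz z^(n-1) P_qq(z), with P_qq = (4/3)(1+z^2)/(1-z)_+ + 2*delta(1-z)."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        z = (i + 0.5) * h                     # midpoints never land on z = 1
        f = z ** (n - 1) * (1.0 + z * z)      # test function multiplying 1/(1-z)_+
        total += (f - 2.0) / (1.0 - z) * h    # plus prescription: subtract f(1) = 2
    return (4.0 / 3.0) * total + 2.0          # delta-function term contributes A = 2
```

The first moment, `pqq_moment(1)`, vanishes (quark number is conserved), while `pqq_moment(2)` is negative, close to $-16/9$: the momentum carried by the quark decreases on radiating a gluon.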

We must now find some way of making sense, physically, of the uncancelled
mass divergence in (15.97).


15.6.2 Factorization, and the order $\alpha_s$ DGLAP equation


The key is to realize that when two partons are in the collinear configuration
their relative momentum is very small, and hence the interaction between
them is very strong, beyond the reach of a perturbative calculation. This
suggests that we should absorb such uncalculable effects into a modified
distribution function $q(x, \mu_F^2)$ given by

$$q(x, \mu_F^2) = q(x) + \frac{\alpha_s(\mu^2)}{2\pi}\int_x^1\frac{{\rm d}y}{y}\,q(y)\left\{P_{qq}(x/y)\ln(\mu_F^2/m^2) + C(x/y)\right\} \qquad (15.104)$$

which we have to take from experiment. Note that we have also absorbed the
non-singular term $C(x/y)$ into $q(x, \mu_F^2)$. In terms of this quantity, then, we
have

$$F_2(x, Q^2) = e_q^2\,x\,q(x, Q^2) \qquad (15.105)$$

$$= e_q^2\,x\int_x^1\frac{{\rm d}y}{y}\,q(y, \mu_F^2)\left\{\delta(1 - x/y) + \frac{\alpha_s(\mu^2)}{2\pi}P_{qq}(x/y)\ln(Q^2/\mu_F^2)\right\} \qquad (15.106)$$

to this order in $\alpha_s$, and for one quark type.

This procedure is, of course, very reminiscent of ultraviolet renormaliza- 
tion, in which u-v divergences are controlled by similarly importing some 
quantities from experiment. In this example, we have essentially made use of 
the simple fact that 


$$\ln(Q^2/m^2) = \ln(Q^2/\mu_F^2) + \ln(\mu_F^2/m^2). \qquad (15.107)$$


The arbitrary scale $\mu_F$ is analogous to the renormalization scale $\mu$ (which we have
retained in $\alpha_s(\mu^2)$), and is here referred to as a ‘factorization scale’. It is
the scale entering into the separation in (15.107), between one (uncalculable)
factor which depends on the i-r parameter $m$ but not on $Q^2$, and the other
(calculable) factor which depends on $Q^2$. The scale $\mu_F$ can be thought of
as one which separates the perturbative short-distance physics from the
non-perturbative long-distance physics. Thus partons emitted at small transverse
momenta $< \mu_F$ (i.e. approximately collinear processes) should be considered
as part of the hadron structure, and are absorbed into $q(x, \mu_F^2)$. Partons
emitted at large transverse momenta contribute to the short-distance (calculable)
part of the cross section. Just as for the renormalization scale, the more terms
that can be included in the perturbative contributions to the mass-singular
terms, the weaker the dependence on $\mu_F$ will be. We have demonstrated the
possibility of factorization only to $O(\alpha_s)$, but proofs to all orders in
perturbation theory exist; reviews are provided by Collins and Soper (1987, 1988).

Returning now to (15.106), the reader can guess what is coming next:
we shall impose the condition that the physical quantity $F_2(x, Q^2)$ must be
independent of the choice of factorization scale $\mu_F^2$. Differentiating (15.106)
partially with respect to $\mu_F^2$, and setting the result to zero, we obtain (to order
$\alpha_s$ on the right-hand side)

$$\mu_F^2\frac{\partial q(x, \mu_F^2)}{\partial\mu_F^2} = \frac{\alpha_s(\mu^2)}{2\pi}\int_x^1\frac{{\rm d}y}{y}\,P_{qq}(x/y)\,q(y, \mu_F^2). \qquad (15.108)$$

This equation is the analogue of equation (15.35) describing the running of
the coupling $\alpha_s$ with $\mu^2$, and is a fundamental equation in the theory of
perturbative applications of QCD. It is called the DGLAP equation, after
Dokshitzer (1977), Gribov and Lipatov (1972), and Altarelli and Parisi (1977).
The above derivation is not rigorous: a more sophisticated treatment (Georgi
and Politzer 1974, Gross and Wilczek 1974) confirms the result and extends
it to higher orders.

Equation (15.108) shows that, although perturbation theory cannot be
used to calculate the distribution function $q(x, \mu_F^2)$ at any particular value
$\mu_F^2 = \mu_0^2$, it can be used to predict how the distribution changes (or ‘evolves’) as
$\mu_F^2$ varies. (We recall from (15.105) that $q(x, \mu_0^2)$ can be found experimentally
via $xq(x, \mu_0^2) = F_2(x, Q^2 = \mu_0^2)/e_q^2$.) As in the case of $\sigma(e^+e^- \to \text{hadrons})$
and the scale $\mu^2$, the choice of factorization scale is arbitrary, and would
cancel from physical quantities if all powers in the perturbation series were
included. Truncating at $N$ terms results in an ambiguity of order $\alpha_s^{N+1}$. In
deep inelastic predictions, the standard choice for scales is $\mu^2 = \mu_F^2 = Q^2$.

The way the non-singlet distribution changes can be understood
qualitatively as follows. The change in the distribution for a quark with momentum
fraction $x$, which absorbs the virtual photon, is given by the integral over $y$ of
the corresponding distribution for a quark with momentum fraction $y$, which
after radiating a gluon retains a fraction $x/y$ of its momentum, with
probability $(\alpha_s/2\pi)P_{qq}(x/y)$. This probability is high for large momentum fractions:
high-momentum quarks lose momentum by radiating gluons. Thus there is
a predicted tendency for the distribution function $q(x, \mu_F^2)$ to get smaller at
large $x$ as $\mu_F^2$ increases, and larger at small $x$ (due to the build-up of slower
partons), while maintaining the integral of the distribution over $x$ as a
constant. The effect is illustrated qualitatively in figure 15.14. In addition, the
radiated gluons produce more $q\bar q$ pairs at small $x$. Thus the nucleon may be
pictured as having more and more constituents, all contributing to its total
momentum, as its structure is probed on ever smaller distance (larger $\mu_F^2$)
scales.

In general, the right-hand side of (15.108) will have to be supplemented 
by terms (calculable from figure 15.10) in which quarks are generated from 
the gluon distribution; the equations must then be closed by a corresponding 
one describing the evolution of the gluon distributions (Altarelli 1982). In the 
now commonly used notation, this generalization of (15.108) reads 


$$\mu_F^2\frac{\partial f_{i/p}(x, \mu_F^2)}{\partial\mu_F^2} = \sum_j\frac{\alpha_s(\mu_F^2)}{2\pi}\int_x^1\frac{{\rm d}y}{y}\,P^{(1)}_{i\leftarrow j}(x/y)\,f_{j/p}(y, \mu_F^2) \qquad (15.109)$$

where the sum is over quark types q and gluons g, $P^{(1)}_{i\leftarrow j}$ is the $j \to i$ splitting
function to this order, and $f_{i/p}$ is the parton distribution function for partons
of type $i$ in the proton. In our previous notation, $P^{(1)}_{q\leftarrow q}(x/y) = P_{qq}(x/y)$,
and $f_{q/p}(x, \mu_F^2) = q(x, \mu_F^2)$. The other splitting functions may be found in
Altarelli (1982).

FIGURE 15.14
Evolution of the distribution function with $\mu_F^2$.

Both the splitting functions and expression (15.106) for $F_2(x, Q^2)$ can be
extended to higher orders in $\alpha_s$. Thus the perturbative expansion (15.106)
becomes

$$F_2(x, Q^2) = x\sum_{n=0}^{\infty}\frac{\alpha_s^n(\mu_F^2)}{(2\pi)^n}\sum_{i=q,g}\int_x^1\frac{{\rm d}z}{z}\,C^{(n)}_{2,i}(z, Q^2, \mu_F^2)\,f_{i/p}(x/z, \mu_F^2), \qquad (15.110)$$

where we have chosen $\mu = \mu_F$. The expansion (15.110) is analogous to (15.63),
and as in that case the coefficient functions will depend on $\mu_F^2$ in such a way
that, order by order, the $\mu_F^2$ dependence will cancel. At zeroth order the
coefficients are the $\mu_F^2$-independent free parton ones, $C^{(0)}_{2,q} = e_q^2\delta(1 - z)$ and
$C^{(0)}_{2,g} = 0$. In most cases the coefficients have been calculated up to order $\alpha_s^2$
(Nakamura et al. 2010).

We ought also to mention that there are in principle non-perturbative
corrections to both (15.63) and (15.110), which are of order $(\Lambda^2_{\overline{\rm MS}}/Q^2)^2$ and
$(\Lambda^2_{\overline{\rm MS}}/Q^2)$ respectively.

FIGURE 15.15
$Q^2$-dependence of the proton structure function $F_2^p$ for various fixed $x$ values
(Hagiwara et al. 2002). $i_x$ is a number depending on the $x$-bin, ranging from
$i_x = 1$ ($x = 0.85$) to $i_x = 28$ ($x = 0.000063$). Figure reprinted with permission
from K Hagiwara et al. Phys. Rev. D 66 010001 (2002). Copyright 2002 by
the American Physical Society.


15.6.3 Comparison with experiment 


Data on nucleon structure functions do indeed show the trend described in
the previous section. Figure 15.15 shows the $Q^2$-dependence of the proton
structure function $F_2^p(x, Q^2) = \sum_i e_i^2\,x f_{i/p}(x, Q^2)$ for various fixed $x$ values, as
compiled by B. Foster, A.D. Martin and M.G. Vincter for the 2002 Particle
Data Group review (Hagiwara et al. 2002). Clearly at larger $x$ ($x > 0.13$) the
function gets smaller as $Q^2$ increases, while at smaller $x$ it increases.

Fits to the data have been made in various ways. One (theoretically
convenient) way is to consider ‘moments’ (Mellin transforms) of the structure
functions, defined by

$$M_q^n(t) = \int_0^1{\rm d}x\,x^{n-1}q(x, t), \qquad (15.111)$$


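where we have taken $\mu^2 = \mu_F^2$ and introduced the variable $t = \ln\mu_F^2$. Taking
moments of both sides of (15.108) and interchanging the order of the $x$ and $y$
integrations, we find

$$\frac{{\rm d}M_q^n(t)}{{\rm d}t} = \frac{\alpha_s(t)}{2\pi}\int_0^1{\rm d}y\,y^{n-1}q(y, t)\int_0^y\frac{{\rm d}x}{y}\left(\frac{x}{y}\right)^{n-1}P_{qq}(x/y). \qquad (15.112)$$

Changing the variable to $z = x/y$ in the second integral, and defining*

$$\gamma_{qq}^n = 4\int_0^1{\rm d}z\,z^{n-1}P_{qq}(z), \qquad (15.113)$$

we obtain

$$\frac{{\rm d}M_q^n(t)}{{\rm d}t} = \frac{\alpha_s(t)}{8\pi}\,\gamma_{qq}^n\,M_q^n(t). \qquad (15.114)$$

Thus the integral in (15.108), which is of convolution type, has been reduced
to product form by this transformation. Now we also know from (15.47) and
(15.48) that

$$\frac{{\rm d}\alpha_s}{{\rm d}t} = -\beta_0\alpha_s^2 \qquad (15.115)$$

with $\beta_0 = (33 - 2N_f)/12\pi$ as usual, to this (one-loop) order. Thus (15.114)
becomes

$$\frac{{\rm d}\ln M_q^n}{{\rm d}\ln\alpha_s} = -\frac{\gamma_{qq}^n}{8\pi\beta_0} \equiv d_{qq}^n. \qquad (15.116)$$

The solution to (15.116) is easily found to be

$$M_q^n(t) = M_q^n(t_0)\left(\frac{\alpha_s(t)}{\alpha_s(t_0)}\right)^{d_{qq}^n}. \qquad (15.117)$$
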

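Applying the prescription (15.99) to $\gamma_{qq}^n$, we find (problem 15.9)

$$\gamma_{qq}^n = -\frac{8}{3} \left[ 1 - \frac{2}{n(n+1)} + 4 \sum_{j=2}^{n} \frac{1}{j} \right], \tag{15.118}$$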


†The notation is not chosen accidentally: the $\gamma$'s are indeed anomalous
dimensions of certain operators which appear in Wilson’s operator product
approach to scaling violations (Wilson 1969); interested readers may pursue this
with Peskin and Schroeder 1995, chapter 18.

FIGURE 15.16

Distributions of $x$ times the unpolarized parton distributions $f(x, \mu^2)$ (where
$f = u_v, d_v, \bar{u}, \bar{d}, s, c, g$) using the MRST2001 parametrization (Martin
et al. 2002) at a scale $\mu^2 = 10\ {\rm GeV}^2$. Figure reprinted with permission
from K Hagiwara et al. Phys. Rev. D 66 010001 (2002). Copyright 2002 by the
American Physical Society.


and then

$$d_{qq}^n = \frac{4}{33 - 2N_f} \left[ 1 - \frac{2}{n(n+1)} + 4 \sum_{j=2}^{n} \frac{1}{j} \right]. \tag{15.119}$$
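As a rough numerical illustration of (15.117)-(15.119), the following Python sketch evaluates the anomalous dimensions with exact rational arithmetic. It is only a sketch under stated assumptions: the names `gamma_qq`, `d_qq` and `evolve_moment` are our own labels, and the sign convention assumed is $d\ln M/d\ln\alpha_s = -\gamma/(8\pi\beta_0) \equiv d$, so that a moment scales as $(\alpha_s(t)/\alpha_s(t_0))^{d}$.

```python
from fractions import Fraction


def gamma_qq(n: int) -> Fraction:
    """gamma_qq^n = -(8/3) [1 - 2/(n(n+1)) + 4 sum_{j=2}^n 1/j]."""
    harmonic_tail = sum(Fraction(1, j) for j in range(2, n + 1))
    return Fraction(-8, 3) * (1 - Fraction(2, n * (n + 1)) + 4 * harmonic_tail)


def d_qq(n: int, n_f: int) -> Fraction:
    """d = -gamma/(8 pi beta_0) with beta_0 = (33 - 2 n_f)/(12 pi).

    The factors of pi cancel, leaving -(3/2) gamma / (33 - 2 n_f).
    """
    return Fraction(-3, 2) * gamma_qq(n) / (33 - 2 * n_f)


def evolve_moment(m0: float, alpha_s_t: float, alpha_s_t0: float,
                  n: int, n_f: int) -> float:
    """One-loop evolution of the n-th non-singlet moment (assumed convention)."""
    return m0 * (alpha_s_t / alpha_s_t0) ** float(d_qq(n, n_f))
```

For $n = 1$ the anomalous dimension vanishes, so the first moment does not evolve; for $n \geq 2$ the exponent is positive and the moment falls as $\alpha_s$ decreases.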


We emphasize again that all the foregoing analysis is directly relevant
only to distributions in which the flavour singlet gluon distributions do not
contribute to the evolution equations. In the more general case, analogous
splitting functions $P_{qg}$, $P_{gq}$ and $P_{gg}$ will enter, folded appropriately
with the gluon distribution function $g(x, t)$, together with the related
quantities $\gamma_{qg}^n$, $\gamma_{gq}^n$ and $\gamma_{gg}^n$. Equation (15.108) is then
replaced by a $2 \times 2$ matrix equation for the evolution of the quark and gluon
moments $M_q^n$ and $M_g^n$.

Returning to (15.117), one way of testing it is to plot the logarithm of one
moment, $\ln M_q^n$, versus the logarithm of another, $\ln M_q^m$, for different
$n, m$ values. A more direct procedure, applicable to the non-singlet case too of
course, is to choose a reference point $\mu_0^2$ and parametrize the parton
distribution functions $f_i(x, t_0)$ in some way. These may then be evolved
numerically, via the DGLAP equations, to the desired scale. Figure 15.16 shows a
typical set of distributions at $\mu^2 = 10\ {\rm GeV}^2$ (Martin et al. 2002). A
global numerical fit is then performed to determine the best values of the
parameters, including the parameter $\Lambda_{\overline{\rm MS}}$ which enters into
$\alpha_s(t)$. An example of such a fit, due to Martin et al. (1994), is shown in
figure 15.17.

FIGURE 15.17

Data on the structure function $F_2$ in muon-proton deep inelastic scattering,
from BCDMS (Benvenuti et al. 1989) and NMC (Amaudruz et al. 1992).
The curves are QCD fits (Martin et al. 1994) as described in the text. Figure
reprinted with permission from A D Martin et al. Phys. Rev. D 50 6734
(1994). Copyright 1994 by the American Physical Society.


It may be worth pausing to reflect on how far our understanding of structure
has developed, via quantum field theory, from the simple ‘fixed number
of constituents’ models which are useful in atomic and nuclear physics. When
nucleons are probed on finer and finer scales, more and more partons (gluons,
$q\bar{q}$ pairs) appear, in a way quantitatively predicted by QCD. The precise
experimental confirmation of these predictions (and many others, as discussed
by Ellis, Stirling and Webber 1996, for example) constitutes a remarkable vote
of confidence, by Nature, in relativistic quantum field theory.

Problems
15.1 Verify equation (15.10). 
15.2 Verify equation (15.27). 
15.3 Check that (15.50) can be rewritten as (15.53). 
15.4 (a) Verify (15.61). (b) Show that the next term in the expansion (15.60) 
is 

(b2 — bi) M 

Bo 


where $b_2 = \beta_2/\beta_0$. By iteratively solving the resulting modified equation
(15.60), show that the corresponding correction to (15.61) is

$$\frac{1}{\beta_0^3 L^3} \left[ b_1^2 (\ln^2 L - \ln L - 1) + b_2 \right].$$


15.5 Verify that for the type of behaviour of the $\beta$ function shown in figure
15.7(b), $\alpha_s^*$ is reached as $q^2 \to 0$.
15.6 Verify equation (15.102).

15.7 Check that the electromagnetic charge $e$ has dimension (mass)$^{\epsilon/2}$ in
$d = 4 - \epsilon$ dimensions.

15.8 Verify equation (O.20) in appendix O.
15.9 Verify equation (15.118). 


Lattice Field Theory, and the
Renormalization Group Revisited


16.1 Introduction 


Throughout this book, thus far, we have relied on perturbation theory as the 
calculational tool, justifying its use in the case of QCD by the smallness of the 
coupling constant at short distances; note, however, that this result itself re- 
quired the summation of an infinite series of perturbative terms. As remarked 
in section 15.3, the concomitant of asymptotic freedom is that $\alpha_s$ really does
become strong at small $Q^2$, or at long distances of order $\Lambda_{\overline{\rm MS}}^{-1} \sim 1$ fm. Here we
have no prospect of getting useful results from perturbation theory: it is the 
non-perturbative regime. But this is precisely the regime in which quarks bind 
together to form hadrons. If QCD is indeed the true theory of the interaction 
between quarks, then it should be able to explain, ultimately, the vast amount 
of data that exists in low energy hadronic physics. For example: what are 
the masses of mesons and baryons? Are there novel colourless states such as 
glueballs? Is SU(2)$_f$ or SU(3)$_f$ chiral symmetry spontaneously broken? What
is the form of the effective interquark potential? What are the hadronic form 
factors, in electromagnetic (chapter 9) or weak (chapter 20) processes? 

After more than 30 years of theoretical development, and machine ad- 
vances, numerical simulations of lattice QCD are now yielding precise answers 
to many of these questions, thereby helping to establish QCD as the correct 
theory of the strong interactions of quarks, and also providing reliable input 
needed for the discovery of new physics. Lattice QCD is a highly mature 
field, and many technical details are beyond our scope. Rather, in this chap- 
ter we aim to give an elementary introduction to lattice field theory in general, 
including some important insights that it generates concerning the renormal- 
ization group. We return to QCD in the final section, with some illustrative 
results. 

In thinking about how to formulate a non-perturbative approach to quan- 
tum field theory, several questions immediately arise. First of all, how can we 
regulate the ultraviolet divergences, and thus define the theory, if we cannot 
get to grips with them via the specific divergent integrals supplied by per- 
turbation theory? We need to be able to regulate the divergences in a way 
which does not rely on their appearance in the Feynman graphs of perturbation
theory. As Wilson (1974, 1975) was the first to propose, one quite
natural non-perturbative way of regulating ultraviolet divergences is to ap- 
proximate continuous space-time by a discrete lattice of points. Such a lattice 
will introduce a minimum distance — namely the lattice spacing ‘a’ between 
neighbouring points. Since no two points can ever be closer than a, there is 
now a corresponding maximum momentum $\Lambda = \pi/a$ (see following equation
(16.6)) in the lattice version of the theory. Thus the theory is automatically 
ultraviolet finite from the start, without presupposing the existence of any 
perturbative expansion; renormalization questions will, however, enter when 
we consider the a dependence of our parameters. As long as the lattice spac- 
ing is much smaller than the physical size of the hadrons one is studying, 
the lattice version of the theory should be a good approximation. Of course, 
Lorentz invariance is sacrificed in such an approach, and replaced by some 
form of hypercubic symmetry; we must hope that for small enough a this will 
not matter. We shall discuss how simple field theories are ‘discretized’ in the 
next section; scalar fields, fermion fields, and gauge fields each require their 
own prescriptions. 

Next, we must ask how a discretized quantum field theory can be formu- 
lated in a way suitable for numerical computation. Any formalism based on 
non-commuting operators seems to be ruled out, since it is hard to see how 
they could be numerically simulated. Indeed, the same would be true of ordi- 
nary quantum mechanics. Fortunately a formulation does exist which avoids 
operators: Feynman’s sum over paths approach, which was briefly mentioned 
in section 5.2.2. This method is the essential starting point for the lattice ap- 
proach to quantum field theory, and it will be introduced in section 16.3. The 
sum over paths approach does not involve quantum operators, but fermions 
still have to be accommodated somehow. The way this is done is briefly 
described in section 16.3: see also appendix P. 

It turns out that this formulation enables direct contact to be made be- 
tween quantum field theory and statistical mechanics, as we shall discuss in 
section 16.3.3. This relationship has proved to be extremely fruitful, allowing 
physical insights and numerical techniques to pass from one subject to the 
other, in a way that has been very beneficial to both. In section 16.4 we make 
a worthwhile detour to explore the physics of renormalization and of the RGE 
from a lattice/statistical mechanics perspective, before returning to QCD in 
section 16.5. 


16.2 Discretization 


16.2.1 Scalar fields 


We start by considering a simple field theory involving a scalar field $\phi$.
Postponing until section 16.3 the question of exactly how we shall use it, we
assume that we shall still want to formulate the theory in terms of an action of
the form

$$S = \int d^4x\, \mathcal{L}(\phi, \nabla\phi, \dot{\phi}). \tag{16.1}$$


It seems plausible that it might be advantageous to treat space and time
as symmetrically as possible, from the start, by formulating the theory in
‘Euclidean’ space, instead of Minkowskian, by introducing $t = -i\tau$; further
motivation for doing this will be provided in section 16.3. In that case, the
action (16.1) becomes

$$S \to -i \int d^3x\, d\tau\, \mathcal{L}\!\left(\phi, \nabla\phi, i\frac{\partial\phi}{\partial\tau}\right) \tag{16.2}$$

$$\equiv i \int d^3x\, d\tau\, \mathcal{L}_E \equiv i S_E. \tag{16.3}$$

A typical free scalar action is then

$$S_E(\phi) = \frac{1}{2} \int d^3x\, d\tau \left[ (\partial_\tau\phi)^2 + (\nabla\phi)^2 + m^2\phi^2 \right]. \tag{16.4}$$


We now represent all of space-time by a finite-volume ‘hypercube’. For
example, we may have $N_1$ lattice points along the $x$-axis, so that a field
$\phi(x)$ is replaced by the $N_1$ numbers $\phi(n_1 a)$ with $n_1 = 0, 1, \ldots, N_1 - 1$.
We write $L = N_1 a$ for the length of the cube side. In this notation, integrals
and differentials are replaced by the finite sums and difference expressions

$$\int dx \to a \sum_{n_1}, \qquad \frac{\partial\phi}{\partial x} \to \frac{1}{a} \left[ \phi(n_1 + 1) - \phi(n_1) \right], \tag{16.5}$$

so that a typical integral (in one dimension) becomes

$$\int dx \left( \frac{\partial\phi}{\partial x} \right)^2 \to a \sum_{n_1} \frac{1}{a^2} \left[ \phi(n_1 + 1) - \phi(n_1) \right]^2. \tag{16.6}$$


As in all our previous work, we can alternatively consider a formulation in
momentum space, which will also be discretized. It is convenient to impose
periodic boundary conditions such that $\phi(x) = \phi(x + L)$. Then the allowed
$k$-values may be taken to be $k_{\nu_1} = 2\pi\nu_1/L$ with
$\nu_1 = -N_1/2 + 1, \ldots, 0, \ldots, N_1/2$ (we take $N_1$ to be even). It follows that
the maximum allowed magnitude of the momentum is then $\pi/a$, indicating that
$a^{-1}$ is (as anticipated) playing the role of our earlier momentum cut-off
$\Lambda$. We then write

$$\phi(n_1) = \frac{1}{(N_1 a)^{1/2}} \sum_{\nu_1} e^{i 2\pi \nu_1 n_1 / N_1}\, \tilde{\phi}(k_{\nu_1}), \tag{16.7}$$

which has the inverse

$$\tilde{\phi}(k_{\nu_1}) = \left( \frac{a}{N_1} \right)^{1/2} \sum_{n_1} e^{-i 2\pi \nu_1 n_1 / N_1}\, \phi(n_1), \tag{16.8}$$

since (problem 16.1)

$$\frac{1}{N_1} \sum_{n_1 = 0}^{N_1 - 1} e^{i 2\pi n_1 (\nu_1 - \nu_2)/N_1} = \delta_{\nu_1, \nu_2}. \tag{16.9}$$
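The transform pair and the orthogonality relation (16.9) are easy to check numerically. The following Python sketch does so with NumPy; it assumes an even number of sites and the normalizations $(N_1 a)^{-1/2}$ and $(a/N_1)^{1/2}$, and the function names are our own illustrative choices.

```python
import numpy as np


def allowed_indices(n_sites: int) -> np.ndarray:
    """Mode labels nu = -N/2 + 1, ..., N/2 (N assumed even)."""
    return np.arange(-n_sites // 2 + 1, n_sites // 2 + 1)


def to_momentum(phi: np.ndarray, a: float) -> np.ndarray:
    """phi_tilde(nu) = sqrt(a/N) sum_n exp(-2 pi i nu n / N) phi(n)."""
    n_sites = len(phi)
    n = np.arange(n_sites)
    nu = allowed_indices(n_sites)
    phases = np.exp(-2j * np.pi * np.outer(nu, n) / n_sites)
    return np.sqrt(a / n_sites) * (phases @ phi)


def to_position(phi_tilde: np.ndarray, a: float) -> np.ndarray:
    """phi(n) = (N a)^(-1/2) sum_nu exp(+2 pi i nu n / N) phi_tilde(nu)."""
    n_sites = len(phi_tilde)
    n = np.arange(n_sites)
    nu = allowed_indices(n_sites)
    phases = np.exp(2j * np.pi * np.outer(n, nu) / n_sites)
    return (phases @ phi_tilde) / np.sqrt(n_sites * a)


def orthogonality(n_sites: int, nu1: int, nu2: int) -> complex:
    """Left-hand side of the discrete delta-function relation."""
    n = np.arange(n_sites)
    return np.sum(np.exp(2j * np.pi * n * (nu1 - nu2) / n_sites)) / n_sites
```

Transforming a random real field to momentum space and back should reproduce it to rounding accuracy, and `orthogonality` should return 1 for equal labels and 0 otherwise.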


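Equation (16.9) is a discrete version of the $\delta$-function relation given in
(E.25) of volume 1. A one-dimensional version of the mass term in (16.4) then
becomes (problem 16.2)

$$\frac{1}{2} \int dx\, m^2 \phi(x)^2 \to \frac{1}{2} m^2 \sum_{k_{\nu_1}} \tilde{\phi}(k_{\nu_1})\, \tilde{\phi}(-k_{\nu_1}), \tag{16.10}$$

while

$$\frac{1}{2} \int dx \left( \frac{\partial\phi}{\partial x} \right)^2 \to \frac{1}{2a} \sum_{n_1} \left[ \phi(n_1 + 1) - \phi(n_1) \right]^2 \tag{16.11}$$

$$= \frac{1}{2} \sum_{k_{\nu_1}} \tilde{\phi}(k_{\nu_1}) \left( \frac{4 \sin^2(k_{\nu_1} a/2)}{a^2} \right) \tilde{\phi}(-k_{\nu_1}). \tag{16.12}$$

Thus a one-dimensional version of the free action (16.4) is

$$\frac{1}{2} \sum_{k_{\nu_1}} \tilde{\phi}(k_{\nu_1}) \left[ \frac{4 \sin^2(k_{\nu_1} a/2)}{a^2} + m^2 \right] \tilde{\phi}(-k_{\nu_1}). \tag{16.13}$$

In the continuum case, (16.13) would be replaced by

$$\frac{1}{2} \int \frac{dk}{2\pi}\, \tilde{\phi}(k) \left[ k^2 + m^2 \right] \tilde{\phi}(-k) \tag{16.14}$$

as usual, which implies that the propagator in the discrete case is proportional
to

$$\left[ \frac{4 \sin^2(k_{\nu_1} a/2)}{a^2} + m^2 \right]^{-1} \tag{16.15}$$

rather than to $[k^2 + m^2]^{-1}$ (remember we are in one-dimensional Euclidean
space). The two expressions do coincide in the continuum limit $a \to 0$. The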
manipulations we have been going through will be easily recognized by readers 
familiar with the theory of lattice vibrations and phonons, and lead to a 
satisfactory discretization of scalar fields. For Dirac fields the matter is not 
so straightforward. 
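To see how quickly the lattice propagator approaches the continuum one, the two denominators can be compared directly. The Python sketch below is illustrative only; the function names are ours, and it simply evaluates $4\sin^2(ka/2)/a^2$ against $k^2$ in one Euclidean dimension.

```python
import numpy as np


def lattice_k_squared(k: float, a: float) -> float:
    """Lattice replacement for k^2: 4 sin^2(k a / 2) / a^2."""
    return 4.0 * np.sin(0.5 * k * a) ** 2 / a**2


def lattice_propagator(k: float, a: float, m: float) -> float:
    """One-dimensional Euclidean lattice propagator, up to normalization."""
    return 1.0 / (lattice_k_squared(k, a) + m**2)


def continuum_propagator(k: float, m: float) -> float:
    """Continuum counterpart 1 / (k^2 + m^2)."""
    return 1.0 / (k**2 + m**2)
```

At fixed $k$ the difference between the two denominators is of order $k^4 a^2/12$, so it vanishes as $a \to 0$; at the edge of the zone, $k = \pi/a$, the lattice expression saturates at $4/a^2$ rather than $\pi^2/a^2$.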


16.2.2 Dirac fields 


The first obvious problem has already been mentioned: how are we to represent
such entirely non-classical objects, which obey anticommutation relations?
This is part of the wider problem of representing field operators in
a form suitable for numerical simulation, which we defer until section 16.3.
There is, however, a quite separate problem which arises when we try to repeat
for the Dirac field the discretization used for the scalar field.

First note that the Euclidean Dirac matrices $\gamma_\mu^E$ are related to the usual
Minkowski ones $\gamma_\mu^M$ by $\gamma_{1,2,3}^E \equiv -i\gamma_{1,2,3}^M$,
$\gamma_4^E \equiv -i\gamma_4^M \equiv \gamma_0^M$. They satisfy
$\{\gamma_\mu^E, \gamma_\nu^E\} = 2\delta_{\mu\nu}$ for $\mu = 1, 2, 3, 4$. The Euclidean
Dirac Lagrangian is then $\bar{\psi}(x)[\gamma_\mu^E \partial_\mu + m]\psi(x)$, which
should be written now in Hermitean form

$$m \bar{\psi}(x)\psi(x) + \frac{1}{2} \left\{ \bar{\psi}(x)\gamma_\mu^E \partial_\mu \psi(x) - (\partial_\mu \bar{\psi}(x))\gamma_\mu^E \psi(x) \right\}. \tag{16.16}$$


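The corresponding ‘one-dimensional’ discretized action is then

$$a \sum_{n_1} \left\{ m \bar{\psi}(n_1)\psi(n_1) + \frac{1}{2} \left[ \bar{\psi}(n_1)\gamma_1^E \frac{\psi(n_1 + 1) - \psi(n_1)}{a} - \frac{\bar{\psi}(n_1 + 1) - \bar{\psi}(n_1)}{a} \gamma_1^E \psi(n_1) \right] \right\} \tag{16.17}$$

$$= a \sum_{n_1} \left\{ m \bar{\psi}(n_1)\psi(n_1) + \frac{1}{2a} \left[ \bar{\psi}(n_1)\gamma_1^E \psi(n_1 + 1) - \bar{\psi}(n_1 + 1)\gamma_1^E \psi(n_1) \right] \right\}. \tag{16.18}$$

In momentum space this becomes (problem 16.3)

$$\sum_{k_{\nu_1}} \bar{\tilde{\psi}}(k_{\nu_1}) \left[ i\gamma_1^E \frac{\sin(k_{\nu_1} a)}{a} + m \right] \tilde{\psi}(k_{\nu_1}), \tag{16.19}$$

and the inverse propagator is $\left[ i\gamma_1^E \frac{\sin(k_{\nu_1} a)}{a} + m \right]$.
Thus the propagator itself is

$$\left[ m - i\gamma_1^E \frac{\sin(k_{\nu_1} a)}{a} \right] \Big/ \left[ m^2 + \frac{\sin^2(k_{\nu_1} a)}{a^2} \right]. \tag{16.20}$$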


But here there is a problem: in addition to the correct continuum limit ($a \to 0$)
found at $k_{\nu_1} \to 0$, an alternative finite $a \to 0$ limit is found at
$k_{\nu_1} \to \pi/a$ (consider expanding $a^{-1} \sin[(\pi/a - \delta)a]$ for small
$\delta$). Thus two modes survive as $a \to 0$, a phenomenon known as the ‘fermion
doubling problem’. Actually in four dimensions there are sixteen such corners of
the hypercube, so we have far too many degenerate lattice copies (which are
called different ‘tastes’, to distinguish them from the real quark flavours).
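The origin of the doublers can be made concrete with a few lines of Python. The sketch below is only illustrative (the function names are ours): it evaluates the naive kinetic factor $\sin(ka)/a$ near $k = 0$ and near $k = \pi/a$, and enumerates the corners of the momentum hypercube at which every component is either $0$ or $\pi/a$.

```python
import itertools
import math


def naive_kinetic(k: float, a: float) -> float:
    """Kinetic factor sin(k a)/a of the naive lattice Dirac operator."""
    return math.sin(k * a) / a


def zone_corners(dim: int, a: float) -> list:
    """All momenta whose components are each 0 or pi/a (2**dim of them)."""
    return list(itertools.product((0.0, math.pi / a), repeat=dim))
```

A small displacement $\delta$ away from $\pi/a$ gives the same value of the kinetic factor as the momentum $\delta$ itself, which is why each corner behaves like a separate light fermion; in four dimensions `zone_corners` returns sixteen entries.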

Various solutions to this problem have been proposed. Wilson (1975), for
example, suggested adding the extra term

$$-\frac{1}{2} r \sum_{n_1} \bar{\psi}(n_1) \left[ \psi(n_1 + 1) + \psi(n_1 - 1) - 2\psi(n_1) \right] \tag{16.21}$$

to the fermion action in this one-dimensional case, where $r$ is dimensionless.
Evidently this is a second difference, and it would correspond to the term

$$-\frac{1}{2} r a \int d^3x\, d\tau\, \bar{\psi} (\partial_\tau^2 + \nabla^2) \psi \tag{16.22}$$

in the four-dimensional continuum action. Note the presence of the lattice
spacing ‘a’ in (16.22), which ensures its disappearance as $a \to 0$. The
higher-derivative term $\bar{\psi}(\partial_\tau^2 + \nabla^2)\psi$ has mass dimension 5,
and therefore requires a coupling constant with mass dimension $-1$, i.e. a
length in units $\hbar = c = 1$; it is, in fact, a non-renormalizable term. However,
if we recall the discussion of section 11.8 in volume 1, we would expect it to be
suppressed at low momenta much less than the cut-off $\pi/a$. Hence it is natural
to see a coupling proportional to $a$ appearing in (16.22). (We shall see in
section 16.5.3 how renormalization group ideas provide a different perspective
on such non-renormalizable interactions, classifying them as ‘irrelevant’.)

How does the extra term (16.21) help the doubling problem? One easily
finds that it changes the (one-dimensional) inverse propagator to

$$i\gamma_1^E \frac{\sin(k_{\nu_1} a)}{a} + m + \frac{r}{a} \left( 1 - \cos(k_{\nu_1} a) \right). \tag{16.23}$$

By considering the expansion of the cosine near $k_{\nu_1} \approx 0$ it can be seen
that the additional term disappears in the continuum limit, as expected. However,
for $k_{\nu_1} \approx \pi/a$ it gives a large term of order $1/a$ which adds to the
mass $m$, effectively banishing the ‘doubled’ state to a very high mass, far from
the physical spectrum.
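The effect of the Wilson term on the doubler can be checked numerically. The following Python sketch is an illustration under our own naming: `wilson_mass` is the momentum-dependent mass $m + (r/a)(1 - \cos ka)$ read off from (16.23), and `inverse_propagator_sq` is the combination $\sin^2(ka)/a^2 + [\text{mass}]^2$ that appears in the denominator of the propagator.

```python
import math


def wilson_mass(k: float, a: float, m: float, r: float = 1.0) -> float:
    """Effective mass m + (r/a)(1 - cos(k a)) with the Wilson term included."""
    return m + (r / a) * (1.0 - math.cos(k * a))


def inverse_propagator_sq(k: float, a: float, m: float, r: float) -> float:
    """Propagator denominator sin^2(k a)/a^2 + wilson_mass^2."""
    return (math.sin(k * a) / a) ** 2 + wilson_mass(k, a, m, r) ** 2
```

With $r = 0$ the denominator at $k = \pi/a$ is just $m^2$, the same as at $k = 0$; with $r \neq 0$ it becomes $(m + 2r/a)^2$, which grows without bound as $a \to 0$.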

Unfortunately there is a price to pay. The problem is that, as we learned in 
section 12.3.2, the QCD Lagrangian has an exact chiral symmetry for massless
quarks. To the extent that $m_u$ and $m_d$ (and $m_s$, but less so) are small on
a hadronic scale such as $\Lambda_{\overline{\rm MS}}$, we expect chiral symmetry to have important
physical consequences. These will indeed be explored in chapter 18. For the 
moment, we note merely that it is important for lattice-based QCD calcu- 
lations to be able to deal correctly with the light quarks. Now we cannot 
simply choose the bare Lagrangian mass parameters to be small, and leave it 
at that. In any interacting theory, renormalization effects will cause shifts in 
these masses. In a chirally symmetric theory, or one which is chirally sym- 
metric as a fermion mass goes to zero, such a mass shift is proportional to the 
fermion mass itself; in particular it does not simply add to the mass. We drew 
attention to this fact in the case of the electron mass renormalization in QED, 
in section 11.2. So in chirally symmetric theories, mass renormalizations are 
‘protected’, in this sense. But the modification (16.21), while avoiding phys- 
ical fermion doublers, breaks chiral symmetry badly. This can easily be seen 
by noting (see (12.154) for example) that the crucial property required for 
chiral symmetry to hold is 


$$\gamma_5 \slashed{D} + \slashed{D} \gamma_5 = 0, \tag{16.24}$$

where $\slashed{D}$ is the SU(3)$_c$-covariant Dirac derivative. Any addition to
$\slashed{D}$ which is proportional to the unit $4 \times 4$ matrix will violate
(16.24), and hence break chiral symmetry. The Lagrangian mass $m$ itself is of
this form, and it breaks chiral symmetry, but ‘softly’, i.e. in a way that
disappears as $m$ goes to zero (thereby preserving the symmetry in this limit).
The Wilson addition (16.21) also breaks chiral symmetry, but it remains there
even as $m \to 0$: it is a ‘hard’ breaking.

This means that in the theory with the Wilson modification (i.e. with 
‘Wilson fermions’) fermion mass renormalization will not be protected by the 
chiral symmetry, so that large additive renormalizations are possible. This 
will require repeated fine-tunings of the bare mass parameters, to bring them 
down to the desired small values. And it turns out that this seriously lengthens 
the computing time. 

Another approach (‘staggered fermions’) was suggested by Kogut and 
Susskind (1975), Banks et al. (1976), and Susskind (1977). This essentially 
involves distributing the 4 spin degrees of freedom of the Dirac field across 
different lattice sites (we shall not need the details). At each site there is now 
a one-component fermion, with the colour degrees of freedom, which speeds 
the calculations. The 16-fold ‘doubling’ degeneracy can be re-arranged as 
four different tastes of 4-component fermions, while retaining enough chiral 
symmetry to forbid additive mass renormalizations. 

Since the different components of the staggered Dirac field now live on 
different sites, they will experience slightly different gauge field interactions. 
(These are of course local in the continuum limit, but the point remains true 
after discretization, as we shall see in the following section.) These interactions 
will mix fields of different tastes, causing new problems, but they can be 
suppressed by adding further terms to the action. There is still the 4-fold 
degeneracy to get rid of, but a trick is available for that, as we shall explain 
in section 16.3. 

One might wonder if a lattice theory with fermions could be formulated
such that it both avoids doublers and preserves chiral symmetry. For quite
a long time it was believed that this was not possible, a conclusion which
was essentially the content of the Nielsen-Ninomiya theorem (Nielsen and
Ninomiya 1981a, b, c). But more recently a way was found to formulate chiral
gauge theories with fermions satisfactorily on the lattice at finite spacing $a$.
The key is to replace the condition (16.24) by the Ginsparg-Wilson (1982)
relation

$$\gamma_5 \slashed{D} + \slashed{D} \gamma_5 = a \slashed{D} \gamma_5 \slashed{D}. \tag{16.25}$$


This relation implies (Lüscher 1998) that the associated action has an exact
symmetry, with infinitesimal variations proportional to

$$\delta\psi = \gamma_5 \left( 1 - \frac{1}{2} a \slashed{D} \right) \psi \tag{16.26}$$

$$\delta\bar{\psi} = \bar{\psi} \left( 1 - \frac{1}{2} a \slashed{D} \right) \gamma_5. \tag{16.27}$$

The symmetry under (16.26)-(16.27), which is proportional to the infinitesimal
version of (12.152) as $a \to 0$, provides a lattice theory with all the
fundamental symmetry properties of continuum chiral gauge theories (Hasenfratz
et al. 1998). Finding an operator which satisfies (16.25) is, however, not so 
easy — but that problem has now been solved, indeed in three different ways: 
Kaplan’s ‘domain wall’ fermions (Kaplan 1992); ‘classically perfect fermions’ 
(Hasenfratz and Niedermayer 1994); and overlap fermions (Narayanan and 
Neuberger 1993a, b, 1994, 1995). Unfortunately all these proposals are com- 
putationally more expensive than the Wilson or staggered fermion alterna- 
tives. 


16.2.3 Gauge fields 


Having explored the discretization of actions for free scalars and Dirac
fermions, we must now think about how to implement gauge invariance on the
lattice. In the usual (continuum) case, we saw in chapter 13 how this was
implemented by replacing ordinary derivatives by covariant derivatives, the
geometrical significance of which (in terms of parallel transport) is discussed
in appendix N. It is very instructive to see how the same ideas arise naturally
in the lattice case.

We illustrate the idea in the simple case of the Abelian U(1) theory, QED.
Consider, for example, a charged scalar field $\phi(x)$, with charge $e$. To
construct a gauge-invariant current, for example, we replaced
$\phi^\dagger \partial_\mu \phi$ by $\phi^\dagger (\partial_\mu + ieA_\mu)\phi$, so we ask:
what is the discrete analogue of this? The term
$\phi^\dagger(x) \frac{\partial}{\partial x} \phi(x)$ becomes, as we have seen,

$$\phi^\dagger(n_1)\, \frac{1}{a} \left[ \phi(n_1 + 1) - \phi(n_1) \right] \tag{16.28}$$


in one dimension. We do not expect (16.28) by itself to be gauge invariant,
and it is easy to check that it is not. Under a gauge transformation for the
continuous case, we have

$$\phi(x) \to e^{ie\theta(x)} \phi(x), \qquad A(x) \to A(x) + \frac{d\theta(x)}{dx}; \tag{16.29}$$

then $\phi^\dagger(x)\phi(y)$ transforms by

$$\phi^\dagger(x)\phi(y) \to e^{-ie[\theta(x) - \theta(y)]} \phi^\dagger(x)\phi(y), \tag{16.30}$$

and is clearly not invariant. The essential reason is that this operator involves
the fields at two different points, and so the term $\phi^\dagger(n_1)\phi(n_1 + 1)$ in
(16.28) will not be gauge invariant either. The discussion in appendix N prepares
us for this: we are trying to compare two ‘vectors’ (here, fields) at two
different points, when the ‘coordinate axes’ are changing as we move about. We
need to parallel transport one field to the same point as the other, before they
can be properly compared. The solution (N.18) shows us how to do this. Consider
the quantity

$$O(x, y) = \phi^\dagger(x) \exp\left[ ie \int_y^x A\, dx' \right] \phi(y). \tag{16.31}$$

Under the gauge transformation (16.29), $O(x, y)$ transforms by

$$O(x, y) \to \phi^\dagger(x) e^{-ie\theta(x)}\, e^{ie \int_y^x A\, dx' + ie[\theta(x) - \theta(y)]}\, e^{ie\theta(y)} \phi(y) = O(x, y), \tag{16.32}$$

and it is therefore gauge invariant. The familiar ‘covariant derivative’ rule can
be recovered by letting $y = x + dx$ for infinitesimal $dx$, and by considering the
gauge-invariant quantity

$$\lim_{dx \to 0} \left[ \frac{O(x, x + dx) - O(x, x)}{dx} \right]. \tag{16.33}$$

Evaluating (16.33) one finds (problem 16.4) the result

$$\phi^\dagger(x) \left( \frac{d}{dx} - ieA \right) \phi(x) \tag{16.34}$$

$$\equiv \phi^\dagger(x) D_x \phi(x) \tag{16.35}$$

with the usual definition of the covariant derivative. In the discrete case, we
merely keep the finite version of (16.31), and replace
$\phi^\dagger(n_1)\phi(n_1 + 1)$ in (16.28) by the gauge invariant quantity

$$\phi^\dagger(n_1)\, U(n_1, n_1 + 1)\, \phi(n_1 + 1), \tag{16.36}$$


where the link variable $U$ is defined by

$$U(n_1, n_1 + 1) = \exp\left[ ie \int_{(n_1 + 1)a}^{n_1 a} A\, dx' \right]. \tag{16.37}$$

Note that

$$U(n_1, n_1 + 1) \to \exp[-ieA(n_1)a] \tag{16.38}$$

in the small $a$ limit.

Similarly, the free Dirac term
$\bar{\psi}(n_1)\gamma_1^E \psi(n_1 + 1) - \bar{\psi}(n_1 + 1)\gamma_1^E \psi(n_1)$ in
(16.18) is replaced by the gauge-invariant term

$$\bar{\psi}(n_1)\gamma_1^E U(n_1, n_1 + 1)\psi(n_1 + 1) - \bar{\psi}(n_1 + 1)\gamma_1^E U(n_1 + 1, n_1)\psi(n_1). \tag{16.39}$$

The generalization to more dimensions is straightforward. In the non-Abelian
SU(2) or SU(3) case, ‘eA’ in (16.38) is replaced by $g t^a A^a(n_1)$ where
the $t$'s are the appropriate matrices, as in the continuum form of the covariant
derivative. A link variable $U(n_2, n_1)$ may be drawn as in figure 16.1. Note
that the order of the arguments is significant:
$U(n_2, n_1) = U^{-1}(n_1, n_2) = U^\dagger(n_1, n_2)$ from (16.38), which is why the
link carries an arrow.

Thus gauge invariant discretized derivatives of charged fields can be
constructed. What about the Maxwell action for the U(1) gauge field? This does
not exist in only one dimension ($\partial_\mu A_\nu - \partial_\nu A_\mu$ cannot be
formed), so let us move into two. Again, our discussion of the geometrical
significance of $F_{\mu\nu}$ as a curvature guides us to the answer. Consider the
product $U_\square$ of link variables around a square path (figure 16.2) of side
$a$ (reading from the right):

$$U_\square = U(n_x, n_y; n_x, n_y + 1)\, U(n_x, n_y + 1; n_x + 1, n_y + 1) \times U(n_x + 1, n_y + 1; n_x + 1, n_y)\, U(n_x + 1, n_y; n_x, n_y). \tag{16.40}$$

FIGURE 16.1
Link variable $U(n_2, n_1)$ in one dimension.


It is straightforward to verify, first, that $U_\square$ is gauge invariant. Under
a gauge transformation, the link $U(n_x + 1, n_y; n_x, n_y)$, for example,
transforms by a factor (cf equation (16.32))

$$\exp\{ ie[\theta(n_x + 1, n_y) - \theta(n_x, n_y)] \}, \tag{16.41}$$

and similarly for the three other links in $U_\square$. In this Abelian case the
exponentials contain no matrices, and the accumulated phase factors cancel out,
verifying the gauge invariance. Next, let us see how to recover the Maxwell
action. Adding the exponentials again, we can write

$$U_\square = \exp\{ -ieaA_y(n_x, n_y) - ieaA_x(n_x, n_y + 1) + ieaA_y(n_x + 1, n_y) + ieaA_x(n_x, n_y) \} \tag{16.42}$$

$$= \exp\left\{ -iea^2 \left[ \frac{A_x(n_x, n_y + 1) - A_x(n_x, n_y)}{a} \right] + iea^2 \left[ \frac{A_y(n_x + 1, n_y) - A_y(n_x, n_y)}{a} \right] \right\} \tag{16.43}$$

$$= \exp\left\{ +iea^2 \left( \frac{\partial A_y}{\partial x} - \frac{\partial A_x}{\partial y} \right) \right\}, \tag{16.44}$$


using the derivative definition of (16.5). For small ‘a’ we may expand the
exponential in (16.44). We also take the real part to remove the imaginary
terms, leading to

$$\sum_\square (1 - {\rm Re}\, U_\square) \to \frac{1}{2} \sum_\square e^2 a^4 (F_{xy})^2 \tag{16.45}$$

where $F_{xy} = \frac{\partial A_y}{\partial x} - \frac{\partial A_x}{\partial y}$ as usual.
To relate this to the continuum limit we must note that we sum over each such
plaquette with only one definite orientation, so that the sum over plaquettes is
equivalent to half of the entire sum. Thus

$$\sum_\square (1 - {\rm Re}\, U_\square) \to \frac{1}{4} \sum_{n_1, n_2} e^2 a^4 F_{\mu\nu}^2 \to e^2 a^2 \int\!\!\int \frac{1}{4} F_{\mu\nu}^2\, dx\, dy. \tag{16.46}$$

FIGURE 16.2

A simple plaquette in two dimensions.

(Note that in two dimensions ‘e’ has dimensions of mass.) In four dimensions
similar manipulations lead to the form

$$S_E = \frac{1}{e^2} \sum_\square (1 - {\rm Re}\, U_\square) \to \frac{1}{4} \int d^4x\, F_{\mu\nu}^2 \tag{16.47}$$

for the lattice action, as required. In the non-Abelian case, as noted above,
‘eA’ is replaced by ‘$g\, \mathbf{t} \cdot \mathbf{A}$’; for SU(3), the analogue of
(16.47) is

$$S_g = \frac{2}{g^2} \sum_\square {\rm Tr}\,(1 - {\rm Re}\, U_\square), \tag{16.48}$$

where the trace is over the SU(3) matrices.
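The gauge invariance of the plaquette sum is simple to confirm numerically in the Abelian case. The Python sketch below is a minimal illustration under our own conventions, which are assumptions and not taken from the text: each link is represented by a real phase, the lattice is two-dimensional and periodic, and a gauge transformation shifts each link phase by the difference of a site function `chi` at its two ends.

```python
import numpy as np


def plaquette_angles(phi_x: np.ndarray, phi_y: np.ndarray) -> np.ndarray:
    """Oriented sum of link phases around each elementary square.

    phi_x[i, j] and phi_y[i, j] are the phases of the links leaving site
    (i, j) in the x and y directions; the lattice is periodic.
    """
    return (phi_x
            + np.roll(phi_y, -1, axis=0)   # phi_y at site (i+1, j)
            - np.roll(phi_x, -1, axis=1)   # phi_x at site (i, j+1)
            - phi_y)


def gauge_transform(phi_x, phi_y, chi):
    """Shift every link phase by the difference of chi at its two ends."""
    new_x = phi_x + np.roll(chi, -1, axis=0) - chi
    new_y = phi_y + np.roll(chi, -1, axis=1) - chi
    return new_x, new_y


def lattice_action(phi_x, phi_y) -> float:
    """Sum over plaquettes of (1 - Re U_plaquette) for U(1) links."""
    return float(np.sum(1.0 - np.cos(plaquette_angles(phi_x, phi_y))))


def single_plaquette_term(e: float, a: float, f_xy: float) -> float:
    """1 - Re U for one plaquette with phase e * a**2 * F_xy."""
    return 1.0 - np.cos(e * a**2 * f_xy)
```

The individual link phases change under `gauge_transform`, but `lattice_action` does not, because the shifts cancel around every closed square. For small $a$, `single_plaquette_term` approaches $\frac{1}{2} e^2 a^4 F_{xy}^2$, in line with (16.45).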


16.3 Representation of quantum amplitudes 


So (with some suitable fermionic action) we have a gauge-invariant ‘classical’ 
field theory defined on a lattice, with a suitable continuum limit. (Actually,
the $a \to 0$ limit of the quantum theory is, as we shall see in section 16.5,
more subtle than the naive replacements (16.5) because of renormalization
issues, as should be no surprise to the reader by now.) However, we have
not yet considered how we are going to turn this classical lattice theory into 
a quantum one. The fact that the calculations are mostly going to have to 
be done numerically seems at once to require a formulation that avoids non- 
commuting operators. This is precisely what is provided by Feynman’s sum 
over paths formulation of quantum mechanics and of quantum field theory, 
and it is therefore an essential element in the lattice approach to quantum 
field theory. In this section we give a brief introduction to this formalism, 
starting with quantum mechanics. 


16.3.1 Quantum mechanics 


In section 5.2.2 we stated that in this approach the amplitude for a quantum
system, described by a Lagrangian $L$ depending on one degree of freedom $q(t)$,
to pass from a state in which $q = q^i$ at $t = t_i$ to a state in which $q = q^f$ at
time $t = t_f$, is proportional to (with $\hbar = 1$)

$$\sum_{{\rm all\ paths}\ q(t)} \exp\left( i \int_{t_i}^{t_f} L(q(t), \dot{q}(t))\, dt \right), \tag{16.49}$$

where $q(t_i) = q^i$, and $q(t_f) = q^f$. We shall now provide some justification for
this assertion.

We begin by recalling how, in ordinary quantum mechanics, state vectors
and observables are related in the Schrödinger and Heisenberg pictures (see
appendix I of volume 1). Let $\hat{q}$ be the canonical coordinate operator in the
Schrödinger picture, with an associated complete set of eigenvectors $|q\rangle$ such
that

$$\hat{q}|q\rangle = q|q\rangle. \tag{16.50}$$

The corresponding Heisenberg operator $\hat{q}_H(t)$ is defined by

$$\hat{q}_H(t) = e^{i\hat{H}(t - t_0)}\, \hat{q}\, e^{-i\hat{H}(t - t_0)} \tag{16.51}$$

where $\hat{H}$ is the Hamiltonian, and $t_0$ is the (arbitrary) time at which the two
pictures coincide. Now define the Heisenberg picture state $|q_t\rangle_H$ by

$$|q_t\rangle_H = e^{i\hat{H}(t - t_0)} |q\rangle. \tag{16.52}$$

We then easily obtain from (16.50)-(16.52) the result

$$\hat{q}_H(t)|q_t\rangle_H = q|q_t\rangle_H, \tag{16.53}$$

which shows that $|q_t\rangle_H$ is the (Heisenberg picture) state which at time $t$ is
an eigenstate of $\hat{q}_H(t)$ with eigenvalue $q$. Consider now the quantity

$${}_H\langle q^f_{t_f} | q^i_{t_i} \rangle_H \tag{16.54}$$




which is, indeed, the amplitude for the system described by $\hat{H}$ to go from
$q^i$ at $t_i$ to $q^f$ at $t_f$. Using (16.52) we can write

$${}_H\langle q^f_{t_f} | q^i_{t_i} \rangle_H = \langle q^f | e^{-i\hat{H}(t_f - t_i)} | q^i \rangle; \tag{16.55}$$

we want to understand how (16.55) can be represented as (16.49).
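We shall demonstrate the connection explicitly for the special case of a
free particle, for which

$$\hat{H} = \frac{\hat{p}^2}{2m}. \tag{16.56}$$

For this case, we can evaluate (16.55) directly as follows. Inserting a complete
set of momentum eigenstates, we obtain¹

$$\begin{aligned}
\langle q^f | e^{-i\hat{H}(t_f - t_i)} | q^i \rangle
&= \int_{-\infty}^{\infty} \langle q^f | e^{-i\hat{H}(t_f - t_i)} | p \rangle \langle p | q^i \rangle\, dp \\
&= \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{ipq^f}\, e^{-ip^2(t_f - t_i)/2m}\, e^{-ipq^i}\, dp \\
&= \frac{1}{2\pi} \int_{-\infty}^{\infty} \exp\left\{ -i \left[ \frac{p^2(t_f - t_i)}{2m} - p(q^f - q^i) \right] \right\} dp.
\end{aligned} \tag{16.57}$$

To evaluate the integral, we complete the square via the steps

$$\begin{aligned}
\frac{p^2(t_f - t_i)}{2m} - p(q^f - q^i)
&= \left( \frac{t_f - t_i}{2m} \right) \left[ p^2 - \frac{2mp(q^f - q^i)}{t_f - t_i} \right] \\
&= \left( \frac{t_f - t_i}{2m} \right) \left[ \left( p - \frac{m(q^f - q^i)}{t_f - t_i} \right)^2 - \frac{m^2(q^f - q^i)^2}{(t_f - t_i)^2} \right] \\
&= \left( \frac{t_f - t_i}{2m} \right) p'^2 - \frac{m(q^f - q^i)^2}{2(t_f - t_i)},
\end{aligned} \tag{16.58}$$

where

$$p' = p - \frac{m(q^f - q^i)}{t_f - t_i}. \tag{16.59}$$

We then shift the integration variable in (16.57) to $p'$, and obtain

$$\langle q^f | e^{-i\hat{H}(t_f - t_i)} | q^i \rangle = \frac{1}{2\pi} \exp\left[ \frac{im(q^f - q^i)^2}{2(t_f - t_i)} \right] \int_{-\infty}^{\infty} \exp\left[ -\frac{i(t_f - t_i)p'^2}{2m} \right] dp'. \tag{16.60}$$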
As it stands, the integral in (16.60) is not well-defined, being rapidly oscillatory
for large $p'$. However, it is at this point that the motivation for passing to
‘Euclidean’ space-time arises. If we make the replacement $t \to -i\tau$, (16.60)
becomes

$$\langle q^f | e^{-\hat{H}(\tau_f - \tau_i)} | q^i \rangle = \frac{1}{2\pi} \exp\left[ -\frac{m(q^f - q^i)^2}{2(\tau_f - \tau_i)} \right] \int_{-\infty}^{\infty} \exp\left[ -\frac{(\tau_f - \tau_i)p'^2}{2m} \right] dp', \tag{16.61}$$

and the integral is a simple convergent Gaussian. Using the result

$$\int_{-\infty}^{\infty} d\xi\, e^{-b\xi^2} = \sqrt{\frac{\pi}{b}} \tag{16.62}$$

we finally obtain

$$\langle q^f | e^{-\hat{H}(\tau_f - \tau_i)} | q^i \rangle = \left[ \frac{m}{2\pi(\tau_f - \tau_i)} \right]^{1/2} \exp\left[ -\frac{m(q^f - q^i)^2}{2(\tau_f - \tau_i)} \right]. \tag{16.63}$$

¹Remember that $\langle q | p \rangle$ is the $q$-space wavefunction of a state with definite
momentum $p$, and is therefore a plane wave; we are using the normalization of
equation (E.26) in volume 1.
We must now understand how the result (16.63) can be represented in the
form (16.49). In Euclidean space, (16.49) is

$$\sum_{\rm paths} \exp\left( -\int_{\tau_i}^{\tau_f} \frac{1}{2} m \left( \frac{dq}{d\tau} \right)^2 d\tau \right) \tag{16.64}$$

in the free-particle case. We interpret the $\tau$ integral in terms of a
discretization procedure, similar to that introduced in section 16.2. We split
the interval $\tau_f - \tau_i$ into $N$ segments each of size $\epsilon$, as shown in
figure 16.3. The $\tau$-integral in (16.64) becomes the sum

$$m \sum_{j=1}^{N} \frac{(q^j - q^{j-1})^2}{2\epsilon}, \tag{16.65}$$

and the ‘sum over paths’, in going from $q^0 \equiv q^i$ at $\tau_i$ to $q^N \equiv q^f$ at
$\tau_f$, is now interpreted as a multiple integral over all the intermediate
positions $q^1, q^2, \ldots, q^{N-1}$ which paths can pass through at ‘times’
$\tau_1, \tau_2, \ldots, \tau_{N-1}$:

$$\frac{1}{A(\epsilon)} \int\!\!\int \cdots \int \exp\left[ -m \sum_{j=1}^{N} \frac{(q^j - q^{j-1})^2}{2\epsilon} \right] \frac{dq^1}{A(\epsilon)} \frac{dq^2}{A(\epsilon)} \cdots \frac{dq^{N-1}}{A(\epsilon)}, \tag{16.66}$$


where A(e) is a normalizing factor, depending on €, which is to be determined. 

The integrals in (16.66) are all of Gaussian form, and since the integral 
of a Gaussian is again a Gaussian (cf the manipulations leading from (16.57) 
to (16.60), but without the ‘i’ in the exponents), we may perform all the 
integrations analytically. We follow the method of Feynman and Hibbs (1965), 
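section 3.1. Consider the integral over $q^1$ (superscripts on $q$ label the time slice and are not powers):

$$I^1 \equiv \int\exp\left\{-\frac{m}{2\epsilon}\left[(q^2-q^1)^2+(q^1-q^i)^2\right]\right\}dq^1. \tag{16.67}$$

FIGURE 16.3
A ‘path’ from $q^0 \equiv q^i$ at $\tau_i$ to $q^N \equiv q^f$ at $\tau_f$, via the intermediate positions $q^1, q^2, \ldots, q^{N-1}$ at $\tau_1, \tau_2, \ldots, \tau_{N-1}$.

This can be evaluated by completing the square, shifting the integration variable, and using (16.62), to obtain (problem 16.5)

$$I^1 = \left(\frac{\pi\epsilon}{m}\right)^{1/2}\exp\left[-\frac{m}{4\epsilon}(q^2-q^i)^2\right]. \tag{16.68}$$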
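For readers who want the intermediate step, one way of completing the square is sketched here. The abbreviations $a \equiv m/(2\epsilon)$ and $\xi \equiv q^1 - \tfrac12(q^2+q^i)$ are introduced only for this step and are not the book's notation; superscripts on $q$ are slice labels, not powers.

```latex
\begin{align*}
(q^2-q^1)^2 + (q^1-q^i)^2
  &= 2\left[q^1 - \tfrac12(q^2+q^i)\right]^2 + \tfrac12\,(q^2-q^i)^2, \\
\int \exp\left\{-a\left[(q^2-q^1)^2+(q^1-q^i)^2\right]\right\} dq^1
  &= e^{-\frac{a}{2}(q^2-q^i)^2}\int_{-\infty}^{\infty} e^{-2a\xi^2}\,d\xi
   = \left(\frac{\pi}{2a}\right)^{1/2} e^{-\frac{a}{2}(q^2-q^i)^2} \\
  &= \left(\frac{\pi\epsilon}{m}\right)^{1/2}
     \exp\left[-\frac{m}{4\epsilon}(q^2-q^i)^2\right].
\end{align*}
```

The Gaussian integral in the second line is (16.62) with $b = 2a$.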
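Now the procedure may be repeated for the $q^2$ integral

$$I^2 \equiv \int\exp\left\{-\frac{m}{4\epsilon}(q^2-q^i)^2-\frac{m}{2\epsilon}(q^3-q^2)^2\right\}dq^2, \tag{16.69}$$

which yields (problem 16.5)

$$I^2 = \left(\frac{4\pi\epsilon}{3m}\right)^{1/2}\exp\left[-\frac{m}{6\epsilon}(q^3-q^i)^2\right]. \tag{16.70}$$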


As far as the exponential factors in (16.68) and (16.70) are concerned, the pattern is now clear: after $n-1$ steps we shall have an exponential factor

$$\exp\left[-m(q^n-q^i)^2/(2n\epsilon)\right]. \tag{16.71}$$

Hence, after $N-1$ steps we shall have a factor

$$\exp\left[-m(q^f-q^i)^2/\left(2(\tau_f-\tau_i)\right)\right], \tag{16.72}$$

remembering that $q^N \equiv q^f$ and that $\tau_f - \tau_i = N\epsilon$. So we have recovered the correct exponential factor of (16.63), and all that remains is to choose $A(\epsilon)$ in (16.66) so as to produce the same normalization as (16.63).


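The required $A(\epsilon)$ is

$$A(\epsilon) = \left(\frac{2\pi\epsilon}{m}\right)^{1/2}, \tag{16.73}$$

as we now verify. For the first ($q^1$) integration, the formula (16.66) contains two factors of $A^{-1}(\epsilon)$, so that the result (16.68) becomes

$$\frac{m}{2\pi\epsilon}\left(\frac{\pi\epsilon}{m}\right)^{1/2}\exp\left[-\frac{m}{4\epsilon}(q^2-q^i)^2\right] = \left[\frac{m}{2\pi\,2\epsilon}\right]^{1/2}\exp\left[-\frac{m}{4\epsilon}(q^2-q^i)^2\right]. \tag{16.74}$$

For the second ($q^2$) integration, the accumulated constant factor is

$$\left[\frac{m}{2\pi\,2\epsilon}\right]^{1/2}\left(\frac{m}{2\pi\epsilon}\right)^{1/2}\left(\frac{4\pi\epsilon}{3m}\right)^{1/2} = \left[\frac{m}{2\pi\,3\epsilon}\right]^{1/2}. \tag{16.75}$$

Proceeding in this way, one can convince oneself that after $N-1$ steps, the accumulated constant is

$$\left[\frac{m}{2\pi N\epsilon}\right]^{1/2} = \left[\frac{m}{2\pi(\tau_f-\tau_i)}\right]^{1/2}, \tag{16.76}$$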


as in (16.63). 
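The equivalence can also be checked numerically. The following Python sketch is not from the text: it evaluates the multiple integral (16.66) on a finite grid of q-values and compares the result with the closed form (16.63). The grid half-width `half_width`, the spacing `dq`, the choice m = 1 and all function names are illustrative assumptions; the normalizing factor is taken to be $A(\epsilon) = (2\pi\epsilon/m)^{1/2}$, as found above.

```python
import math


def free_kernel(q_to, q_from, tau, m=1.0):
    """Closed-form Euclidean free-particle amplitude, as in (16.63)."""
    return math.sqrt(m / (2.0 * math.pi * tau)) * math.exp(
        -m * (q_to - q_from) ** 2 / (2.0 * tau))


def discretized_amplitude(q_i, q_f, tau, n_steps, m=1.0, half_width=10.0, dq=0.05):
    """Evaluate the multiple integral (16.66) on a uniform grid of q-values.

    Each intermediate integral dq^j / A(eps) is replaced by a Riemann sum
    over the grid, with A(eps) = sqrt(2*pi*eps/m).
    """
    if n_steps < 2:
        raise ValueError("need at least two time slices (one intermediate point)")
    eps = tau / n_steps
    a_eps = math.sqrt(2.0 * math.pi * eps / m)
    n_grid = int(round(2.0 * half_width / dq)) + 1
    grid = [-half_width + k * dq for k in range(n_grid)]
    # The weight of one time slice depends only on the grid separation.
    slice_weight = [math.exp(-m * (k * dq) ** 2 / (2.0 * eps)) for k in range(n_grid)]

    # First slice q^i -> q^1, including the one 'surplus' factor 1/A(eps).
    psi = [math.exp(-m * (q - q_i) ** 2 / (2.0 * eps)) / a_eps for q in grid]
    # Integrate over q^1 ... q^(N-2), stepping to the next intermediate point.
    for _ in range(n_steps - 2):
        psi = [sum(slice_weight[abs(k - l)] * psi[l] for l in range(n_grid)) * dq / a_eps
               for k in range(n_grid)]
    # Final integral over q^(N-1), ending at the fixed endpoint q^f.
    return sum(math.exp(-m * (q_f - q) ** 2 / (2.0 * eps)) * p
               for q, p in zip(grid, psi)) * dq / a_eps


print("closed form (16.63):", free_kernel(1.0, 0.0, 1.0))
print("discretized (16.66):", discretized_amplitude(0.0, 1.0, 1.0, 4))
```

Because composing Gaussian slices reproduces the Gaussian of the total interval exactly, the two numbers should agree for any N ≥ 2 up to the small error introduced by the grid.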

The equivalence of (16.63) and (16.64) (in the sense $\epsilon \to 0$) is therefore established for the free-particle case. More general cases are discussed in Feynman and Hibbs (1965) chapter 5, and in Peskin and Schroeder (1995) chapter 9. The conventional notation for the path-integral amplitude is

$$\langle q^f|e^{-\hat{H}(\tau_f-\tau_i)}|q^i\rangle = \int\mathcal{D}q(\tau)\,e^{-\int_{\tau_i}^{\tau_f}L\,d\tau}, \tag{16.77}$$

where the right-hand side of (16.77) is interpreted in the sense of (16.66).

We now proceed to discuss further aspects of the path-integral formulation. Consider the (Euclideanized) amplitude $\langle q^f|e^{-\hat{H}(\tau_f-\tau_i)}|q^i\rangle$, and insert a complete set of energy eigenstates $|n\rangle$ such that $\hat{H}|n\rangle = E_n|n\rangle$:

$$\langle q^f|e^{-\hat{H}(\tau_f-\tau_i)}|q^i\rangle = \sum_n\langle q^f|n\rangle\langle n|q^i\rangle e^{-E_n(\tau_f-\tau_i)}. \tag{16.78}$$


Equation (16.78) shows that if we take the limits $\tau_i \to -\infty$, $\tau_f \to \infty$, then the state of lowest energy $E_0$ (the ground state) provides the dominant contribution. Thus, in this limit, our amplitude will represent the process in which the system begins in its ground state $|\Omega\rangle$ at $\tau_i \to -\infty$, with $q = q^i$, and ends in $|\Omega\rangle$ at $\tau_f \to \infty$, with $q = q^f$.
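As a rough illustration of this ground-state dominance (a sketch, not part of the text), suppose the spectrum is that of a harmonic oscillator, $E_n = n + \tfrac12$ in units where the level spacing is 1, and replace every overlap factor $\langle q^f|n\rangle\langle n|q^i\rangle$ in (16.78) by unity. Both choices, and the function names below, are assumptions made purely for illustration. The logarithmic slope of the resulting sum should then approach $E_0$ as the Euclidean time interval grows.

```python
import math


def euclidean_sum(tau, n_levels=200):
    """Sum of exp(-E_n * tau) over the assumed spectrum E_n = n + 1/2."""
    return sum(math.exp(-(n + 0.5) * tau) for n in range(n_levels))


def effective_energy(tau, delta=0.5):
    """Finite-difference estimate of -d ln(sum) / d tau at Euclidean time tau."""
    return -math.log(euclidean_sum(tau + delta) / euclidean_sum(tau)) / delta


for tau in (0.5, 2.0, 8.0, 20.0):
    # The estimate starts above E_0 = 0.5 and settles onto it as tau grows.
    print(tau, effective_energy(tau))
```

At small $\tau$ the excited states still contribute and the estimate lies above $E_0$; the excess decays like $e^{-(E_1 - E_0)\tau}$.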

How do we represent propagators in this formalism? Consider the expres- 
sion (somewhat analogous to a field theory propagator) 


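$$G_{fi}(t_a,t_b) = {}_H\langle q^f_{t_f}|T\{\hat{q}_H(t_a)\hat{q}_H(t_b)\}|q^i_{t_i}\rangle_H, \tag{16.79}$$

where T is the usual time-ordering operator. Using (16.51) and (16.52), (16.79) can be written, for $t_b > t_a$, as

$$G_{fi}(t_a,t_b) = \langle q^f|e^{-i\hat{H}(t_f-t_b)}\,\hat{q}\,e^{-i\hat{H}(t_b-t_a)}\,\hat{q}\,e^{-i\hat{H}(t_a-t_i)}|q^i\rangle. \tag{16.80}$$

Inserting a complete set of states and Euclideanizing, (16.80) becomes

$$G_{fi}(\tau_a,\tau_b) = \int dq^a\,dq^b\,q^a q^b\,\langle q^f|e^{-\hat{H}(\tau_f-\tau_b)}|q^b\rangle\,\langle q^b|e^{-\hat{H}(\tau_b-\tau_a)}|q^a\rangle\,\langle q^a|e^{-\hat{H}(\tau_a-\tau_i)}|q^i\rangle. \tag{16.81}$$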


Now, each of the three matrix elements has a discretized representation of the form (16.66), with say $N_1 - 1$ variables in the interval $(\tau_a, \tau_i)$, $N_2 - 1$ in $(\tau_b, \tau_a)$ and $N_3 - 1$ in $(\tau_f, \tau_b)$. Each such representation carries one ‘surplus’ factor of $[A(\epsilon)]^{-1}$, making an overall factor of $[A(\epsilon)]^{-3}$. Two of these factors can be associated with the $dq^a dq^b$ integration in (16.81), so that we have a total of $N_1 + N_2 + N_3 - 1$ properly normalized integrations, and one ‘surplus’ factor $[A(\epsilon)]^{-1}$ as in (16.66). If we now identify $q(\tau_a) \equiv q^a$, $q(\tau_b) \equiv q^b$, it follows that (16.81) is simply

$$\int\mathcal{D}q(\tau)\,q(\tau_a)q(\tau_b)\,e^{-\int_{\tau_i}^{\tau_f}L\,d\tau}. \tag{16.82}$$

In obtaining (16.82), we took the case $\tau_b > \tau_a$. Suppose alternatively that $\tau_a > \tau_b$. Then the order of $\tau_a$ and $\tau_b$ inside the interval $(\tau_i, \tau_f)$ is simply reversed, but since $q^a$ and $q^b$ in (16.81), or $q(\tau_a)$ and $q(\tau_b)$ in (16.82), are ordinary (commuting) numbers, the formula (16.82) is unaltered, and actually does represent the matrix element (16.79) of the time-ordered product.


16.3.2 Quantum field theory 


The generalizations of these results to the field theory case are intuitively 
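clear. For example, in the case of a single scalar field $\phi(x)$, we expect the analogue of (16.82) to be (cf (16.4))

$$\int\mathcal{D}\phi(x)\,\phi(x_a)\phi(x_b)\exp\left[-\int_{\tau_i}^{\tau_f}\mathcal{L}_E(\phi,\nabla\phi,\partial_\tau\phi)\,d^4x_E\right], \tag{16.83}$$

where

$$d^4x_E = d^3\boldsymbol{x}\,d\tau, \tag{16.84}$$

and the boundary conditions are given by $\phi(\boldsymbol{x},\tau_i) = \phi^i(\boldsymbol{x})$, $\phi(\boldsymbol{x},\tau_f) = \phi^f(\boldsymbol{x})$, $\phi(\boldsymbol{x},\tau_a) = \phi^a(\boldsymbol{x})$ and $\phi(\boldsymbol{x},\tau_b) = \phi^b(\boldsymbol{x})$, say. In (16.83), we have to understand that a four-dimensional discretization of Euclidean space-time is implied, the fields being Fourier-analyzed by four-dimensional generalizations of expressions such as (16.7). Just as in (16.79)-(16.82), (16.83) is equal to

$$\langle\phi^f(\boldsymbol{x})|e^{-\hat{H}\tau_f}\,T\{\hat\phi_H(x_a)\hat\phi_H(x_b)\}\,e^{\hat{H}\tau_i}|\phi^i(\boldsymbol{x})\rangle. \tag{16.85}$$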


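Taking the limits $\tau_i \to -\infty$, $\tau_f \to \infty$ will project out the configuration of lowest energy, as discussed after (16.78), which in this case is the (interacting) vacuum state $|\Omega\rangle$. Thus in this limit the surviving part of (16.85) is

$$\langle\phi^f(\boldsymbol{x})|\Omega\rangle\,e^{-E_\Omega\tau}\,\langle\Omega|T\{\hat\phi_H(x_a)\hat\phi_H(x_b)\}|\Omega\rangle\,e^{-E_\Omega\tau}\,\langle\Omega|\phi^i(\boldsymbol{x})\rangle \tag{16.86}$$

with $\tau \to \infty$. The exponential and overlap factors can be removed by dividing by the same quantity as (16.85) but without the additional fields $\phi(x_a)$ and $\phi(x_b)$. In this way, we obtain the formula for the field theory propagator in four-dimensional Euclidean space:

$$\langle\Omega|T\{\hat\phi_H(x_a)\hat\phi_H(x_b)\}|\Omega\rangle = \lim_{\tau\to\infty}\frac{\int\mathcal{D}\phi\,\phi(x_a)\phi(x_b)\exp\left[-\int_{-\tau}^{\tau}\mathcal{L}_E\,d^4x_E\right]}{\int\mathcal{D}\phi\,\exp\left[-\int_{-\tau}^{\tau}\mathcal{L}_E\,d^4x_E\right]}. \tag{16.87}$$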
Vacuum expectation values of time-ordered products of more fields will simply 
have more factors of $\phi$ on both sides.

Perturbation theory can be developed in this formalism also. Suppose 
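$\mathcal{L}_E = \mathcal{L}_E^0 + \mathcal{L}_E^{\rm int}$, where $\mathcal{L}_E^0$ describes a free scalar field and $\mathcal{L}_E^{\rm int}$ is an interaction, for example $\lambda\phi^4$. Then, assuming $\lambda$ is small, the exponential in (16.87) can be expressed as

$$\exp\left[-\int d^4x_E\left(\mathcal{L}_E^0+\mathcal{L}_E^{\rm int}\right)\right] = \left(\exp-\int d^4x_E\,\mathcal{L}_E^0\right)\left(1-\lambda\int d^4x_E\,\phi^4+\ldots\right) \tag{16.88}$$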
and both numerator and denominator of (16.87) may be expressed as vevs of 
products of free fields. Compact techniques exist for analyzing this formula- 
tion of perturbation theory (Ryder 1985, chapter 6, Peskin & Schroeder 1995, 
chapter 9), and one finds exactly the same ‘Feynman rules’ as in the canonical 
(operator) approach. 
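The structure of this expansion can be tried out on a toy problem. The sketch below is not from the text: it replaces the functional integral by a single ordinary integral over one variable $\phi$ with weight $\exp(-\phi^2/2 - \lambda\phi^4)$ (a ‘zero-dimensional field theory’, an assumption made only for illustration), and compares a direct numerical evaluation of $\langle\phi^2\rangle$ with the first-order result obtained by expanding numerator and denominator as in (16.88). With the free moments $\langle\phi^2\rangle_0 = 1$, $\langle\phi^4\rangle_0 = 3$, $\langle\phi^6\rangle_0 = 15$, that first-order result is $(1 - 15\lambda)/(1 - 3\lambda) \approx 1 - 12\lambda$.

```python
import math


def toy_vev(power, lam, half_width=10.0, n_points=20001):
    """<phi^power> for the weight exp(-phi^2/2 - lam*phi^4), by the trapezoidal rule."""
    h = 2.0 * half_width / (n_points - 1)
    numerator = denominator = 0.0
    for k in range(n_points):
        phi = -half_width + k * h
        weight = math.exp(-0.5 * phi ** 2 - lam * phi ** 4)
        if k in (0, n_points - 1):
            weight *= 0.5  # trapezoidal end-point weights
        numerator += weight * phi ** power
        denominator += weight
    # The grid spacing h cancels in the ratio.
    return numerator / denominator


lam = 0.001
print("numerical  <phi^2>:", toy_vev(2, lam))
print("first order 1-12*lam:", 1.0 - 12.0 * lam)
```

The residual difference between the two printed numbers is of order $\lambda^2$, as expected for a first-order truncation.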

In the case of gauge theories, we can easily imagine a formula similar to 
(16.87) for the gauge field propagator, in which the integral is carried out over 
all gauge fields $A_\mu(x)$ (in the U(1) case, for example). But we already know
from chapter 7 (or from chapter 13 in the non-Abelian case) that we shall 
not be able to construct a well-defined perturbation theory in this way, since 
the gauge field propagator will not exist unless we ‘fix the gauge’ by imposing 
some constraint, such as the Lorentz gauge condition. Such constraints can
be imposed on the corresponding path integral, and indeed this was the route 
followed by Faddeev and Popov (1967) in first obtaining the Feynman rules 
for non-Abelian gauge theories, as mentioned in section 13.5.3. 

In the discrete case, the appropriate integration variables are the link variables $U(l_i)$ where $l_i$ is the $i$th link. They are elements of the relevant gauge group: for example $U(n_1, n_1+1)$ of (16.3.1) is an element of U(1). In the
case of the unitary groups, such elements typically have the form (cf (12.35)) 
~ exp(i Hermitean matrix), where the ‘Hermitean matrix’ can be parametrized 
in some convenient way — for example, as in (12.31) for SU(2). In all these 
cases, the variables in the parametrization of U vary over some bounded do- 
main (they are essentially ‘angle-type’ variables, as in the simple U(1) case), 
and so, with a finite number of lattice points, the integral over the link vari- 
ables is well-defined without gauge-fixing. The integration measure for the 
link variables can be chosen so as to be gauge invariant, and hence provided 
the action is gauge invariant, the formalism provides well-defined expressions, 
independently of perturbation theory, for vevs of gauge invariant quantities. 

There remains one more conceptual problem to be addressed in this ap- 
proach: namely, how are we to deal with fermions? It seems that we must 
introduce new variables which, though not quantum field operators, must 
nevertheless anticommute with each other. Such ‘classical’ anticommuting 
variables are called Grassmann variables, and are briefly described in appendix P. Further details are contained in Ryder (1985) and in Peskin and Schroeder (1995, section 9.5). For our purposes, the important point is that the fermion Lagrangian is bilinear in the (Grassmann) fermion fields $\psi$, the fermionic action for one flavour having the form

$$S_{\psi_f} = \bar\psi_f\,M_f(U)\,\psi_f \tag{16.89}$$

where $M_f$ is a matrix representing the Dirac operator $i\not\!\!D - m_f$ in its discretized and Euclideanized form. This means that in a typical fermionic amplitude of the form (cf the denominator of (16.87))

$$Z_{\psi_f} = \int\mathcal{D}\bar\psi_f\,\mathcal{D}\psi_f\,\exp[-S_{\psi_f}], \tag{16.90}$$

one has essentially an integral of Gaussian type (albeit with Grassmann variables), which can actually be performed analytically.² The result is simply $\det[M_f(U)]$, the determinant of the Dirac operator matrix. For $N_f$ flavours, this easily generalizes to

$$\prod_{f=1}^{N_f}\det M_f(U). \tag{16.91}$$

²See appendix P.

Now we may write

$$\prod_{f=1}^{N_f}\det M_f(U) = \exp\left[\sum_{f}\ln\det M_f(U)\right], \tag{16.92}$$

so that the effect of $N_f$ fermions is to contribute an additional term

$$S_{\rm eff}(U) = -\sum_{f}\ln\det[M_f(U)] \tag{16.93}$$


to the gluonic action. But although formally correct, this fermionic contribu- 
tion is computationally very time-consuming to include. Until the mid-1990s 
it could not be done, and instead calculations were made using the quenched 
approximation, in which the determinant is set equal to a constant indepen- 
dent of the link variables U. This is equivalent to the neglect of closed fermion 
loops in a Feynman graph approach, i.e. no vacuum polarization insertions 
on virtual gluon lines. Vacuum polarization amplitudes typically behave as 
$q^2/m_f^2$ for $q^2 \ll m_f^2$, where $q$ is the momentum flowing into the loop (see equation (11.39), for example, in the case of QED). The quenched approximation
is therefore poorer for the light quarks u, d and s. 

By the later 1990s it was possible to include the determinant provided the 
quark masses were not too small: the computation slowed down seriously for 
light quark masses. So calculations were done for unphysically large values of 
$m_u$, $m_d$ and $m_s$, and the results extrapolated towards the physical values.

Beginning in the early 2000s, however, more precise calculations with sub- 
stantially lighter quark masses became possible, using the staggered fermion 
formulation discussed in section 16.2.2. It will be recalled that this saves a 
factor of four in the number of degrees of freedom. But there is still the re- 
maining problem of the four unwanted additional ‘tastes’. If these tastes are 
degenerate, as they would be in the continuum limit, then we can use the 
simple trick of replacing $S_{\rm eff}(U)$ by $\frac14 S_{\rm eff}(U)$, which means that we take the fourth root of the staggered fermion determinant. The true physical (non-degenerate) quark flavour multiplicity still remains, of course, and we arrive at

$$S_{\rm eff,\,stag.} = -\frac14\ln\det\left\{M_{{\rm stag.}\,u}(U)\,M_{{\rm stag.}\,d}(U)\,M_{{\rm stag.}\,s}(U)\right\}. \tag{16.94}$$


Unfortunately, things are not so simple away from the continuum limit, at 
finite lattice spacing a. Bernard, Golterman and Shamir (2006) pointed out 
that the quantity 

$$[\det M_{\rm stag.}(U)]^{1/4} \tag{16.95}$$


cannot be represented by a local single-taste theory except in the continuum 
limit: at finite a, it represents a non-local single-taste action. Locality is a 
very fundamental property of all successful quantum field theories, and its 
recovery from (16.95) in the limit $a \to 0$ is not obvious. We refer to Sharpe
(2006) for a full discussion, and further references. Meanwhile, as we shall see 
in section 16.6, some of the currently (in 2011) most accurate published results 
in lattice QCD are using staggered fermions with the ‘rooting’ procedure. 
16.3.3 Connection with statistical mechanics


Not the least advantage of the path integral formulation of quantum field 
theory (especially in its lattice form) is that it enables a highly suggestive 
connection to be set up between quantum field theory and statistical me- 
chanics. We introduce this connection now, by way of a preliminary to the 
discussion of renormalization in the following section. 

The connection is made via the fundamental quantity of equilibrium sta- 
tistical mechanics, the partition function Z defined by 


$$Z = \sum_{\rm configurations}\exp\left(-\frac{H}{k_BT}\right), \tag{16.96}$$


which is simply the ‘sum over states’ (or configurations) of the relevant de- 
grees of freedom, with the Boltzmann weighting factor. H is the classical 
Hamiltonian evaluated for each configuration. Consider, for comparison, the 
denominator in (16.87), namely 


$$Z_\phi = \int\mathcal{D}\phi\,\exp(-S_E), \tag{16.97}$$

where

$$S_E = \int d^4x_E\,\mathcal{L}_E = \int d^4x_E\left\{\frac12(\partial_\tau\phi)^2+\frac12(\nabla\phi)^2+\frac12m^2\phi^2+\lambda\phi^4\right\} \tag{16.98}$$

in the case of a single scalar field with mass $m$ and self-interaction $\lambda\phi^4$. The Euclideanized Lagrangian density $\mathcal{L}_E$ is like an energy density: it is bounded from below, and increases when the field has large magnitude or has large gradients in $\tau$ or $\boldsymbol{x}$. The factor $\exp(-S_E)$ is then a sensible statistical weight for the fluctuations in $\phi$, and $Z_\phi$ may be interpreted as the partition function for a system described by the field degree of freedom $\phi$, but of course in four ‘spatial’ dimensions.

The parallel becomes perhaps even stronger when we discretize space-time. 
In an Ising model (see the following section), the Hamiltonian has the form 


$$H = -J\sum_n s_ns_{n+1}, \tag{16.99}$$


where $J$ is a constant, and the sum is over lattice sites $n$, the system variables taking the values ±1. When (16.99) is inserted into (16.96), we arrive at something very reminiscent of the $\phi(n_1)\phi(n_1+1)$ term in (16.6). Naturally, the effective ‘Hamiltonian’ is not quite the same, though we may note that Wilson (1971b) argued that in the case of a $\phi^4$ interaction the parameters can be chosen so as to make the values $\phi = \pm1$ the most heavily weighted in $S_E$.
Statistical mechanics does, of course, deal in three spatial dimensions, not 
the four of our Euclideanized space-time. Nevertheless, it is remarkable that quantum field theory in three spatial dimensions appears to have such a close relationship to equilibrium statistical mechanics in four spatial dimensions.

One insight we may draw from this connection is that, in the case of pure 
gauge actions (16.47) or (16.48), the gauge coupling is seen to be analogous 
to an inverse temperature, by comparison with (16.96). One is led to wonder 
whether something like transitions between different ‘phases’ exist, as coupling 
constants (or other parameters) vary — and, indeed, such changes of ‘phase’ 
can occur. 

A second point is somewhat related to this. In statistical mechanics, an 
important quantity is the correlation length $\xi$, which for a spin system may be defined via the spin-spin correlation function

$$G(\boldsymbol{x}) = \langle s(\boldsymbol{x})s(0)\rangle = \sum_{{\rm all}\ s(\boldsymbol{x})}s(\boldsymbol{x})s(0)\,e^{-H/k_BT}, \tag{16.100}$$

where we are once more reverting to a continuous $\boldsymbol{x}$ variable. For large $|\boldsymbol{x}|$, this takes the form

$$G(\boldsymbol{x}) \propto \frac{1}{|\boldsymbol{x}|}\exp\left(\frac{-|\boldsymbol{x}|}{\xi(T)}\right). \tag{16.101}$$

The Fourier transform of this (in the continuum limit) is

$$\tilde{G}(\boldsymbol{k}^2) \propto \left(\boldsymbol{k}^2+\xi^{-2}(T)\right)^{-1}, \tag{16.102}$$

as we learned in section 1.3.3. Comparing (16.100) with (16.87), it is clear that (16.100) is proportional to the propagator (or Green function) for the field $s(\boldsymbol{x})$; (16.102) then shows that $\xi^{-1}(T)$ is playing the role of a mass term $m$. Now, near a critical point for a statistical system, correlations exist over very large scales $\xi$ compared to the inter-atomic spacing $a$; in fact, at the critical point $\xi(T_c) \sim L$, where $L$ is the size of the system. In the quantum field theory, as indicated earlier, we may regard $a^{-1}$ as playing a role analogous to a momentum cut-off $\Lambda$, so the regime $\xi \gg a$ is equivalent to $m \ll \Lambda$, as was indeed always our assumption. Thus studying a quantum field theory this way is analogous to studying a four-dimensional statistical system near a critical point. This shows rather clearly why it is not going to be easy: correlations over all scales will have to be included. At this point, we are naturally led to the consideration of renormalization in the lattice formulation.
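A correlation length can be seen at work in the simplest possible setting, the one-dimensional Ising chain with Hamiltonian (16.99). The Python sketch below is an illustration added here, not part of the text. It enumerates all configurations of a short periodic chain to obtain the normalized correlation $\langle s_0s_r\rangle$ and compares it with the standard closed form $(t^r + t^{N-r})/(1+t^N)$, where $t = \tanh K$ and $K \equiv J/(k_BT)$. For a long chain this is simply $t^r = e^{-r/\xi}$ with $\xi = -1/\ln\tanh K$ in units of the lattice spacing. In one dimension there is no $1/|\boldsymbol{x}|$ prefactor as in (16.101), and the function names are assumptions.

```python
import itertools
import math


def ising_correlation(K, n_sites, r):
    """<s_0 s_r> for a periodic 1D Ising chain, by explicit enumeration."""
    z = corr = 0.0
    for spins in itertools.product((-1, 1), repeat=n_sites):
        bond_sum = sum(spins[n] * spins[(n + 1) % n_sites] for n in range(n_sites))
        weight = math.exp(K * bond_sum)  # Boltzmann factor exp(-H / k_B T)
        z += weight
        corr += weight * spins[0] * spins[r]
    return corr / z


def correlation_length(K):
    """xi in units of the lattice spacing, from (tanh K)^r = exp(-r / xi)."""
    return -1.0 / math.log(math.tanh(K))


K, n_sites, r = 0.7, 10, 3
t = math.tanh(K)
print("enumeration:", ising_correlation(K, n_sites, r))
print("closed form:", (t ** r + t ** (n_sites - r)) / (1.0 + t ** n_sites))
print("xi / a     :", correlation_length(K))
```

As $K$ grows (low temperature) $\tanh K \to 1$ and $\xi/a$ diverges, which is the regime $\xi \gg a$ discussed above.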




16.4 Renormalization, and the renormalization group, 
on the lattice 


16.4.1 Introduction 


In the continuum formulation which we have used elsewhere in this book, 
fluctuations over short distances of order $\Lambda^{-1}$ generally lead to divergences in the limit $\Lambda \to \infty$, which are controlled (in a renormalizable theory) by the procedure of renormalization. Such divergent fluctuations turn out, in fact, to affect a renormalizable theory only through the values of some of its parameters and, if these parameters are taken from experiment, all other quantities become finite, even as $\Lambda \to \infty$. This latter assertion is not easy
to prove, and indeed is quite surprising. However, this is by no means all 
there is to renormalization theory: we have seen the power of ‘renormal- 
ization group’ ideas in making testable predictions for QCD. Nevertheless, 
the methods of chapter 15 were rather formal, and the reader may well 
feel the need of a more physical picture of what is going on. Such a pic- 
ture was provided by Wilson (1971a) (see also Wilson and Kogut 1974), us- 
ing the ‘lattice + path integral’ approach. Another important advantage 
of this formalism is, therefore, precisely the way in which, thanks to Wil- 
son’s work, it provides access to a more intuitive way of understanding renor- 
malization theory. The aim of this section is to give a brief introduction 
to Wilson’s ideas, so as to illuminate the formal treatment of the previous 
chapter. 


In the ‘lattice + path integral’ approach to quantum field theory, the 
degrees of freedom involved are the values of the field(s) at each lattice site, 
as we have seen. Quantum amplitudes are formed by integrating suitable 
quantities over all values of these degrees of freedom, as in (16.87) for example. 
From this point of view, it should be possible to examine specifically how the 
‘short distance’ or ‘high momentum’ degrees of freedom affect the result. In 
fact, the idea suggests itself that we might be able to perform explicitly the 
integration (or summation) over those degrees of freedom located near the 
cutoff $\Lambda$ in momentum space, or separated by only a lattice site or two in
co-ordinate space. If we can do this, the result may be compared with the 
theory as originally formulated, to see how this ‘integration over short-distance 
degrees of freedom’ affects the physical predictions of the theory. Having done 
this once, we can imagine doing it again — and indeed iterating the process, 
until eventually we arrive at some kind of ‘effective theory’ describing physics 
in terms of ‘long-distance’ degrees of freedom. 


There are several aspects of such a programme which invite comment. 
First, the process of ‘integrating out’ short-distance degrees of freedom will 
obviously reduce the number of effective degrees of freedom, which is necessarily very large in the case $\xi \gg a$, as envisaged above. Thus it must be
a step in the right direction. Secondly, the above sketch of the ‘integrating 
out’ procedure suggests that, at any given stage of the integration, we shall 
be considering the system as described by parameters (including masses and 
couplings) appropriate to that scale, which is of course strongly reminiscent 
of RGE ideas. And thirdly, we may perhaps anticipate that the result of 
this ‘integrating out’ will be not only to render the parameters of the theory 
scale-dependent, but also, in general, to introduce new kinds of effective in- 
teractions into the theory. We now consider some simple examples which we 
hope will illustrate these points. 




FIGURE 16.4 
A portion of the one-dimensional lattice of spins in the Ising model. 


16.4.2 Two one-dimensional examples 


Consider first a simple one-dimensional Ising model with Hamiltonian (16.99) 
and partition function 


$$Z = \sum_{\{s_n\}}\exp\left[K\sum_{n=0}^{N-1}s_ns_{n+1}\right], \tag{16.103}$$

where $K = J/(k_BT) > 0$. In (16.103) all the $s_n$ variables take the values ±1 and the ‘sum over $\{s_n\}$’ means that all possible configurations of the $N$ variables $s_0, s_1, s_2, \ldots, s_{N-1}$ are to be included. The spin $s_n$ is located at the lattice site $na$, and we shall (implicitly) be assuming the periodic boundary condition $s_n = s_{N+n}$. Figure 16.4 shows a portion of the one-dimensional lattice with the spins on the sites, each site being separated by the lattice constant $a$. Thus, for the portion $\{s_{N-1}, s_0, \ldots, s_4\}$ we are evaluating

$$\sum_{s_{N-1},s_0,s_1,s_2,s_3,s_4}\exp\left[K(s_{N-1}s_0+s_0s_1+s_1s_2+s_2s_3+s_3s_4)\right]. \tag{16.104}$$


Now suppose we want to describe the system in terms of a ‘coarser’ lattice, 
with lattice spacing $2a$, and corresponding new spin variables $s'_n$. There are many ways we could choose to describe the $s'_n$, but here we shall only consider a very simple one (Kadanoff 1977) in which each $s'_n$ is simply identified with the $s_n$ at the corresponding site (see figure 16.5). For the portion of the lattice under consideration, then, (16.104) becomes

$$\sum_{s_{N-1},s'_0,s_1,s'_1,s_3,s'_2}\exp\left[K(s_{N-1}s'_0+s'_0s_1+s_1s'_1+s'_1s_3+s_3s'_2)\right]. \tag{16.105}$$


If we can now perform the sums over $s_1$ and $s_3$ in (16.105), we shall end up (for this portion) with an expression involving the ‘effective’ spin variables $s'_0$, $s'_1$ and $s'_2$, situated twice as far apart as the original ones, and therefore providing a more ‘coarse grained’ description of the system. Summing over $s_1$ and $s_3$ corresponds to ‘integrating out’ two short-distance degrees of freedom as discussed earlier.

FIGURE 16.5
A ‘coarsening’ transformation applied to the lattice portion shown in figure 16.4. The new (primed) spin variables are situated twice as far apart as the original (unprimed) ones.

In fact, these sums are easy to do. Consider the quantity $\exp(Ks'_0s_1)$, expanded as a power series:

$$\exp(Ks'_0s_1) = 1 + Ks'_0s_1 + \frac{K^2}{2!} + \frac{K^3}{3!}(s'_0s_1) + \ldots \tag{16.106}$$

where we have used $(s'_0s_1)^2 = 1$. It follows that

$$\exp(Ks'_0s_1) = \cosh K\,(1 + s'_0s_1\tanh K), \tag{16.107}$$

and similarly

$$\exp(Ks_1s'_1) = \cosh K\,(1 + s_1s'_1\tanh K). \tag{16.108}$$

Thus the sum over $s_1$ is

$$\sum_{s_1=\pm1}\cosh^2K\,\left(1 + s'_0s_1\tanh K + s_1s'_1\tanh K + s'_0s'_1\tanh^2K\right). \tag{16.109}$$

Clearly, the terms linear in $s_1$ vanish after summing, and the $s_1$ sum becomes just

$$2\cosh^2K\,\left(1 + s'_0s'_1\tanh^2K\right). \tag{16.110}$$

Remarkably, (16.110) contains a new ‘nearest-neighbour’ interaction, $s'_0s'_1$, just like the original one in (16.103), but with an altered coupling (and a different spin-independent piece). In fact, we can write (16.110) in the standard form

$$\exp\left[g_1(K) + K's'_0s'_1\right] \tag{16.111}$$

and then use (16.107) to set

$$\tanh K' = \tanh^2K \tag{16.112}$$

and identify

$$g_1(K) = \ln\left(\frac{2\cosh^2K}{\cosh K'}\right). \tag{16.113}$$


Exactly the same steps can be followed through for the sum on s3 in (16.105), 
and indeed for all the sums over the ‘integrated out’ spins. The upshot is 
that, apart from the accumulated spin-independent part, the new partition 
function, defined on a lattice of spacing $2a$, has the same form as the old one, but with a new coupling $K'$ related to the old one $K$ by (16.112).
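This statement can be tested directly on a small chain. In the sketch below (an added illustration; the function names and the chain length are assumptions) the partition function of a periodic chain of 8 spins with coupling $K$ is computed by brute-force enumeration and compared with $\exp[4g_1(K)]$ times the partition function of a 4-spin chain with the renormalized coupling $K'$, where $\tanh K' = \tanh^2K$ and $g_1(K) = \ln(2\cosh^2K/\cosh K')$. One factor $e^{g_1}$ arises for each spin that is summed over.

```python
import itertools
import math


def partition_function(K, n_sites):
    """Z for a periodic 1D Ising chain with coupling K, by explicit enumeration."""
    total = 0.0
    for spins in itertools.product((-1, 1), repeat=n_sites):
        bond_sum = sum(spins[n] * spins[(n + 1) % n_sites] for n in range(n_sites))
        total += math.exp(K * bond_sum)
    return total


def decimate(K):
    """One coarsening step: return the new coupling K' and the constant g_1(K)."""
    K_new = math.atanh(math.tanh(K) ** 2)
    g1 = math.log(2.0 * math.cosh(K) ** 2 / math.cosh(K_new))
    return K_new, g1


K = 0.8
K_new, g1 = decimate(K)
print("Z_8(K)               :", partition_function(K, 8))
print("exp(4 g_1) * Z_4(K') :", math.exp(4 * g1) * partition_function(K_new, 4))
```

The two printed values should coincide up to rounding, since the decimation is exact for this model.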

Equation (16.112) is an example of a renormalization transformation: the number of degrees of freedom has been halved, the lattice spacing has doubled, and the coupling $K$ has been renormalized to $K'$.

It is clear that we could apply the same procedure to the new Hamiltonian, 
introducing a coupling $K''$ which is related to $K'$, and thence to $K$, by

$$\tanh K'' = (\tanh K')^2 = (\tanh K)^4. \tag{16.114}$$

This is equivalent to iterating the renormalization transformation; after $n$ iterations, the effective lattice constant is $2^na$, and the effective coupling is given by

$$\tanh K^{(n)} = (\tanh K)^{2^n}. \tag{16.115}$$


The successive values $K', K'', \ldots$ of the coupling under these iterations can
be regarded as a ‘flow’ in the (one-dimensional) space of K-values: a renor- 
malization flow. 

Of particular interest is a point (or points) K* such that 


$$\tanh K^* = \tanh^2K^*. \tag{16.116}$$


This is called a fixed point of the renormalization transformation. At such a point in K-space, changing the scale by a factor of 2 (or $2^n$ for that matter) will make no difference, which means that the system must be in some sense ordered. Remembering that $K = J/(k_BT)$, we see that $K = K^*$ when the temperature is ‘tuned’ to the value $T = T^* = J/(k_BK^*)$. Such a $T^*$ would be the temperature of a critical point for the thermodynamics of the system, corresponding to the onset of ordering. In the present case, the only fixed points are $K^* = \infty$ and $K^* = 0$. Thus there is no critical point at a non-zero $T^*$, and hence no transition to an ordered phase. However, we may describe the behaviour as $T \to 0$ as ‘quasi-critical’. For large $K$, we may use

$$\tanh K \simeq 1 - 2e^{-2K} \tag{16.117}$$

to write (16.115) as

$$K^{(n)} \simeq K - \frac12\ln2^n, \tag{16.118}$$

which shows that $K^{(n)}$ changes only very slowly (logarithmically) under iterations when in the vicinity of a very large value of $K$, so that this is ‘almost’ a fixed point.
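Both features of this flow are easy to see numerically. The short sketch below (an added illustration with an assumed function name) iterates the exact relation $\tanh K^{(n)} = (\tanh K)^{2^n}$ of (16.115), and compares it with the large-$K$ estimate $K - \tfrac12\ln2^n$.

```python
import math


def coupling_after(K, n):
    """K^(n) after n doublings of the lattice spacing, from (16.115)."""
    return math.atanh(math.tanh(K) ** (2 ** n))


# Starting at a moderate coupling, the flow runs quickly towards K* = 0.
print([round(coupling_after(1.0, n), 6) for n in range(6)])

# Starting at a large coupling, each step lowers K by roughly (1/2) ln 2 only.
for n in range(4):
    print(n, coupling_after(5.0, n), 5.0 - 0.5 * math.log(2 ** n))
```

The second loop shows the ‘almost fixed’ behaviour: the exact and approximate values track each other closely as long as $K^{(n)}$ remains large.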

We may represent the flow of K, under the renormalization transformation 
(16.115), as in figure 16.6. Note that the flow is away from the quasi-fixed 
point at $K^* = \infty$ ($T = 0$) and towards the (non-interacting) fixed point at $K^* = 0$.

FIGURE 16.6
‘Renormalization flow’: the arrows show the direction of flow of the coupling K as the lattice constant is increased. The starred values are fixed points.

FIGURE 16.7
The renormalization flow for the transformation (16.120).

A renormalization transformation which has a fixed point at a finite (neither zero nor infinite) value of the coupling is clearly of greater interest, since this will correspond to a critical point at a finite temperature. A simple such example given by Kadanoff (1977) is the transformation

$$K' = \frac12(2K)^2 \tag{16.119}$$

for a doubling of the effective lattice size, or

$$K^{(n)} = \frac12(2K)^{2^n} \tag{16.120}$$

for $n$ such iterations. The model leading to (16.120) involves fermions in one dimension, but the details are irrelevant to our purpose here. The renormalization transformation (16.120) has three fixed points: $K^* = 0$, $K^* = \infty$ and the finite point $K^* = \frac12$. The renormalization flow is shown in figure 16.7.

The striking feature of this flow is that the motion is always away from 
the finite fixed point, under successive iterations. This may be understood by 
recalling that at the fixed point (which is a critical point for the statistical 
system) the correlation length $\xi$ must be infinite (as $L \to \infty$). As we iterate away from this point, $\xi$ decreases and we leave the fixed (or critical) point. For this model, $\xi$ is given by Kadanoff (1977) as

$$\xi = \frac{a}{|\ln2K|}, \tag{16.121}$$

which indeed goes to infinity at $K = \frac12$.
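A few lines of Python make the behaviour of this model concrete (a sketch with assumed function names, not taken from Kadanoff's paper). Since $\ln(2K') = 2\ln(2K)$ under $K' = \tfrac12(2K)^2$, the correlation length $\xi = a/|\ln2K|$, measured in units of the current lattice spacing, halves at every step; measured in fixed physical units it is unchanged, because the lattice spacing itself doubles.

```python
import math


def kadanoff_step(K):
    """One doubling of the lattice spacing: K' = (1/2) (2K)^2."""
    return 0.5 * (2.0 * K) ** 2


def xi_in_lattice_units(K):
    """xi / a = 1 / |ln 2K|; diverges at the finite fixed point K* = 1/2."""
    return 1.0 / abs(math.log(2.0 * K))


for K_start in (0.49, 0.51):
    K = K_start
    for step in range(5):
        print(K_start, step, K, xi_in_lattice_units(K))
        K = kadanoff_step(K)
```

Starting just below $K^* = \tfrac12$ the coupling runs down towards 0, and starting just above it the coupling runs up towards infinity, in both cases away from the finite fixed point.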


16.4.3 Connections with particle physics 


Let us now begin to think about how all this may relate to the treatment of 
the renormalization group in particle physics, as given in the previous chapter. 
FIGURE 16.8
The $\beta$-function of (16.124); the arrows indicate increasing $f$.


First, we need to consider a continuous change of scale, say by a factor of f. 
In the present model, the transformation (16.120) then becomes 


$$K(fa) = \frac12\left(2K(a)\right)^f. \tag{16.122}$$

Differentiating (16.122) with respect to $f$, we find

$$f\frac{dK(fa)}{df} = K(fa)\ln[2K(fa)]. \tag{16.123}$$

We may reasonably call (16.123) a renormalization group equation, describing the ‘running’ of $K(fa)$ with the scale $f$, analogous to the RGE's for $\alpha$ and $\alpha_s$ considered in chapter 15. In this case, the $\beta$-function is

$$\beta(K) = K\ln(2K), \tag{16.124}$$

which is sketched in figure 16.8. The zero of $\beta$ is indeed at the fixed (critical) point $K = \frac12$, and this is an infrared unstable fixed point, the flow being away from it as $f$ increases.
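The relation between the differential form and the finite transformation can be checked by integrating the renormalization group equation numerically. The sketch below is an added illustration: it uses a classical fourth-order Runge-Kutta step in the variable $\ln f$ (our choice of method, as are the function names) to solve $dK/d\ln f = K\ln(2K)$, and compares the result with the closed form $K(fa) = \tfrac12(2K(a))^f$.

```python
import math


def beta(K):
    """The beta-function K ln(2K) of the one-coupling model (requires K > 0)."""
    return K * math.log(2.0 * K)


def run_coupling(K0, f, n_steps=1000):
    """Integrate dK/d(ln f) = beta(K) from scale a to scale f*a with RK4."""
    h = math.log(f) / n_steps
    K = K0
    for _ in range(n_steps):
        k1 = beta(K)
        k2 = beta(K + 0.5 * h * k1)
        k3 = beta(K + 0.5 * h * k2)
        k4 = beta(K + h * k3)
        K += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return K


def closed_form(K0, f):
    """The finite transformation (1/2) (2 K0)^f."""
    return 0.5 * (2.0 * K0) ** f


for K0 in (0.4, 0.5, 0.6):
    print(K0, run_coupling(K0, 2.0), closed_form(K0, 2.0))
```

A coupling that starts exactly at $K = \tfrac12$ stays there, while couplings on either side run away from it as $f$ increases.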

The foregoing is exactly analogous to the discussion in section 15.5: see in 
particular figure 15.6 and the related discussion. Note, however, that in the 
present case we are considering rescalings in position space, not momentum 
space. Since momenta are measured in units of $a^{-1}$, it is clear that scaling $a$ by $f$ is the same as scaling $k$ by $f^{-1} = t$, say. This will produce a change in sign in $dK/dt$ relative to $dK/df$, and accounts for the fact that $K = \frac12$ is an infrared unstable fixed point in figure 16.8, while $\alpha_s^*$ is an infrared stable
fixed point in figure 15.6(b). Allowing for the change in sign, figure 16.8 is 
quite analogous to figure 15.6(a). 
We have emphasized that, at a critical point, and in the continuum limit, the correlation length $\xi \to \infty$, or equivalently the mass parameter (cf (16.102)) $m = \xi^{-1} \to 0$. In this case, the Fourier transform of the spin-spin correlation function should behave as

$$\tilde{G}(\boldsymbol{k}^2) \propto \frac{1}{\boldsymbol{k}^2}. \tag{16.125}$$

This is indeed the $\boldsymbol{k}^2$-dependence of the propagator of a free, massless scalar particle, but (as we learned for the fermion propagator in section 15.5) it is no longer true in an interacting theory. In the interacting case, (16.125) generally becomes modified to

$$\tilde{G}(\boldsymbol{k}^2) \propto \frac{1}{(\boldsymbol{k}^2)^{1-\eta/2}}, \tag{16.126}$$

or equivalently

$$G(\boldsymbol{x}) \propto \frac{1}{|\boldsymbol{x}|^{1+\eta}} \tag{16.127}$$

in three spatial dimensions, and in the continuum limit. Thus, at a critical point, the spin-spin correlation function exhibits scaling under the transformation $\boldsymbol{x}' = f\boldsymbol{x}$, but it is not free-field scaling. Comparing (16.126) with (15.75), we see that $\eta/2$ is precisely the anomalous dimension of the field $s(\boldsymbol{x})$, so (just as in section 15.5) we have an example of scaling with anomalous dimension. In the statistical mechanics case, $\eta$ is a critical exponent, one of a number of such quantities characterizing the critical behaviour of a system. In general, $\eta$ will depend on the coupling constant, $\eta(K)$: at a non-trivial fixed point, $\eta$ will be evaluated at the fixed point value $K^*$, $\eta(K^*)$. Enormous
progress was made in the theory of critical phenomena when the powerful 
methods of quantum field theory were applied to calculate critical exponents 
(see for example Peskin & Schroeder 1995, chapter 13, and Binney et al. 
1992). 
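To make the scaling statement explicit: the critical power law $\tilde{G} \propto (\boldsymbol{k}^2)^{-(1-\eta/2)}$ of (16.126), and its position-space counterpart $G \propto |\boldsymbol{x}|^{-(1+\eta)}$ in three dimensions, imply the following behaviour under $\boldsymbol{x}' = f\boldsymbol{x}$ (equivalently $\boldsymbol{k}' = \boldsymbol{k}/f$). This is a direct consequence of the two power laws, written out here only for convenience; free-field scaling is recovered by setting $\eta = 0$.

```latex
\begin{align*}
G(f\boldsymbol{x}) &= f^{-(1+\eta)}\,G(\boldsymbol{x}), &
\tilde{G}(\boldsymbol{k}^2/f^2) &= f^{\,2-\eta}\,\tilde{G}(\boldsymbol{k}^2).
\end{align*}
```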

In our discussion so far, we have only considered simple models with just 
one ‘coupling constant’, so that diagrams of renormalization flow were one- 
dimensional. Generally, of course, Hamiltonians will consist of several terms, 
and the behaviour of all their coefficients will need to be considered under a 
renormalization transformation. The general analysis of renormalization flow 
in multi-dimensional coupling space was given by Wegner (1972). In simple 
terms, the coefficients show one of three types of behaviour under renormalization transformations such that $a \to fa$, characterized by their behaviour in
the vicinity of a fixed point: (i) the difference from the fixed point value grows 
as f increases, so that the system moves away from the fixed point (as in the 
single-coupling examples considered earlier); (ii) the difference decreases as f 
increases, so the system moves towards the fixed point; (iii) there is no change 
in the value of the coupling as f changes. The corresponding coefficients are 
called, respectively, (i) relevant, (ii) irrelevant and (iii) marginal couplings; the 
terminology is also frequently applied to the operators in the Hamiltonians 
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themselves. The intuitive meaning of ‘irrelevant’ is clear enough: the system 
will head towards a fixed point as f — oo whatever the initial values of the 
irrelevant couplings. The critical behaviour of the system will therefore be 
independent of the number and type of all irrelevant couplings, and will be 
determined by the relatively few (in general) marginal and relevant couplings. 
Thus all systems which flow close to the fixed point will display the same 
critical exponents determined by the dynamics of these few couplings. This 
explains the property of universality observed in the physics of phase transi- 
tions, whereby many apparently quite different physical systems are described 
(in the vicinity of their critical points) by the same critical exponents. 

Additional terms in the Hamiltonian are, in fact, generally introduced 
following a renormalization transformation. In the quantum field case, we 
may expect that renormalization transformations associated with $a \to fa$,
and iterations thereof, will in general lead to an effective theory involving
all possible couplings allowed by whatever symmetries are assumed to be
relevant. Thus, if we start with a typical ‘$\phi^4$’ scalar theory as given by
(16.98), we shall expect to generate all possible couplings involving $\phi$
and its derivatives. At first sight, this may seem disturbing: after all, the
original theory (in four dimensions) is a renormalizable one, but an
interaction such as $A\phi^6$ is not renormalizable according to the criterion
given in section 11.8 (in four dimensions $\phi$ has mass dimension unity, so
that $A$ must have mass dimension $-2$). It is, however, essential to remember
that in this ‘Wilsonian’ approach to renormalization, summations over momenta
appearing in loops do not, after one iteration $a \to fa$, run up to the
original cut-off value $\pi/a$, but only up to the lower cut-off $\pi/fa$. The
additional interactions compensate for this
change. 

In fact, we shall now see how the coefficients of non-renormalizable inter- 
actions correspond precisely to irrelevant couplings in Wilson's approach, so 
that their effect becomes negligible as we iterate to scales much larger than 
a. We consider continuous changes of scale characterized by a factor f, and 
we discuss a theory with only a single scalar field $\phi$ for simplicity.
Imagine, therefore, that we have integrated out, in (16.97), those components
of $\phi(x)$ with a < |x| < fa. We will be left with a functional integral of
the form (16.97), but with $\phi(x)$ restricted to |x| > fa, and with
additional interaction terms in the action. In order to interpret the result in
Wilson’s terms, we must rewrite it so that it has the same general form as the
original $Z_\phi$ of (16.97). A simple way to do this is to rescale distances by

$$x' = x/f \qquad (16.128)$$

so that the functional integral is now over $\phi(x')$ with |x’| > a, as in
(16.97).
We now define the fixed point of the renormalization transformation to be 
that in which all the terms in the action are zero, except the ‘kinetic’ piece; 
this is the ‘free-field’ fixed point. Thus, we require the kinetic action to be
unchanged:

$$\int d^4x\, \partial_\mu\phi\,\partial^\mu\phi
  = \int d^4x'\, \partial'_\mu\phi'\,\partial'^\mu\phi'
  = \frac{1}{f^2}\int d^4x\, \partial_\mu\phi'\,\partial^\mu\phi' , \qquad (16.129)$$

from which it follows that $\phi' = f\phi$. Consider now a term of the form
$A\phi^6$:

$$A\int d^4x\, \phi^6 = \frac{A}{f^2}\int d^4x'\, \phi'^6 . \qquad (16.130)$$

(16.130) shows that the ‘new’ $A'$ is related to the old one by $A' = A/f^2$ and
in particular that, as f increases, A’ decreases and is therefore an irrelevant 
coupling, tending to zero as we reach large scales. But such an interaction 
is precisely a non-renormalizable one (in four dimensions), according to the 
criterion of section 11.8. The mass dimension of $\phi$ is unity, and hence
that of $A$ must be $-2$ so that the action is dimensionless; couplings with
negative mass dimensions correspond to non-renormalizable interactions. The
reader may verify the generality of this result for any interaction with p
powers of $\phi$, and q derivatives of $\phi$.
However, the mass term $m^2\phi^2$ behaves differently:

$$m^2\int d^4x\, \phi^2 = m^2 f^2\int d^4x'\, \phi'^2 , \qquad (16.131)$$

showing that $m'^2 = m^2 f^2$ and the ‘coupling’ $m^2$ is relevant, since it
grows with $f^2$. Such a term has positive mass dimension, and corresponds to a
‘super-renormalizable’ interaction. Finally, the $\lambda\phi^4$ interaction
transforms as

$$\lambda\int d^4x\, \phi^4 = \lambda\int d^4x'\, \phi'^4 , \qquad (16.132)$$

and so $\lambda' = \lambda$. The coupling is marginal, which may correspond
(though not necessarily) to a renormalizable interaction. To find out if such
couplings increase or decrease with f, we have to include higher-order loop
corrections. The foregoing analysis in terms of the suppression of
non-renormalizable interactions by powers of $f^{-1}$ parallels precisely the
similar one in section 11.8. We saw that such terms were suppressed at low
energies by factors of $E/\Lambda$, where $\Lambda$ is the cut-off scale beyond
which the theory is supposed to fail on physical grounds (e.g. $\Lambda$ might
be the Planck mass). The result is that as
we renormalize, in Wilson’s sense, down to much lower energy scales, the 
non-renormalizable terms disappear and we are left with an effective renor- 
malizable theory. This is the field theory analogue of ‘universality’. 
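
The power counting of (16.129)-(16.132) can be checked mechanically. The
following is a minimal sketch, assuming only the tree-level rescaling used
above (x = f x' and, in d dimensions, a field rescaling fixed by keeping the
kinetic term unchanged); the function names are ours and purely illustrative.
For a term with p powers of the field and q derivatives, the coupling picks up
a factor f^k with k = d - q - p(d-2)/2, which is just its mass dimension.

```python
from fractions import Fraction


def coupling_scaling_exponent(p, q, d=4):
    """Exponent k in g' = g * f**k for a term g * (derivative)^q * phi^p.

    Tree-level counting only: d^d x contributes f**d, each derivative
    f**(-1), and each field f**(-(d-2)/2) (so that phi' = f*phi in d = 4).
    """
    field_dim = Fraction(d - 2, 2)  # mass dimension of phi
    return d - q - p * field_dim


def classify(p, q, d=4):
    """Label the coupling as relevant, irrelevant or marginal."""
    k = coupling_scaling_exponent(p, q, d)
    if k > 0:
        return "relevant"
    if k < 0:
        return "irrelevant"
    return "marginal"


# The three cases discussed in the text, in four dimensions.
for name, p, q in [("m^2 phi^2", 2, 0), ("lambda phi^4", 4, 0), ("A phi^6", 6, 0)]:
    print(name, coupling_scaling_exponent(p, q), classify(p, q))
```

The kinetic term itself (p = 2, q = 2) gives k = 0, as it must by
construction. Loop corrections, which decide the fate of marginal couplings,
are outside the scope of this counting.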

These ideas have an important application in lattice QCD. One of the 
reasons for systematic inaccuracies in lattice computations is that the
continuum is being simulated by a lattice of finite spacing. Symanzik (1983)
showed that corrections to continuum theory results stemming from finite
lattice spacing could be diminished systematically by the use of lattice
actions that also
include suitable irrelevant terms. This procedure is routinely adopted in ac- 
curate lattice calculations with ‘Symanzik-improved’ actions. 

One further word should be said about terms such as ‘$m^2\phi^2$’ (which arise
in the Higgs sector of the Standard Model, for instance). As we have seen,
$m^2$ scales by $m'^2 = m^2 f^2$, which is a rapid growth with f. If we imagine
starting at a very high scale, such as $10^{15}$ TeV and flowing down to 1 TeV,
then the ‘initial’ value of m will have to be very finely ‘tuned’ in order to end 
up with a mass of order 1 TeV. Thus, in this picture, it seems unnatural to 
have scalar particles with masses much less than the physical cut-off scale, 
unless some symmetry principle ‘protects’ their light masses. We shall return 
to this problem in section 22.8.1. 
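
To put a number on the tuning, the sketch below uses only the tree-level
scaling $m'^2 = m^2 f^2$ of (16.131), with masses measured in units of the
(fixed, rescaled) cut-off. It is a rough estimate under that assumption, not a
statement about the full loop-corrected flow; the function name is ours.

```python
def required_initial_mass_ratio(high_scale, low_scale, final_ratio=1.0):
    """Initial m^2/cutoff^2 needed so that m'^2 = m^2 * f^2 ends at final_ratio.

    high_scale and low_scale are in the same units (TeV below); f is the
    total scale factor accumulated along the flow.
    """
    f = high_scale / low_scale
    return final_ratio / f**2


ratio = required_initial_mass_ratio(1e15, 1.0)  # from 10^15 TeV down to 1 TeV
print(f"initial m^2 / cutoff^2 must be of order {ratio:.1e}")
```

On this counting the initial mass-squared parameter must be set some thirty
orders of magnitude below its natural size.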

We now return to lattice QCD, with a brief survey of some of the impressive 
results now being obtained numerically. 




16.5 Lattice QCD 
16.5.1 Introduction, and the continuum limit 


Let us begin by considering some numbers. The lattice must be large enough 
so that the spatial dimension R of the object we wish to describe — say the size 
of a hadron — fits comfortably inside it, otherwise the result will be subject 
to ‘finite size effects’ as the hypercube side length L is varied. We also need
$R \gg a$, or else the granularity of the lattice resolution will become
apparent. Further, as indicated earlier, we expect the mass m (which is of
order $R^{-1}$) to be very much less than $a^{-1}$. Thus ideally we need


$$a \ll R \sim 1/m \ll L = Na \qquad (16.133)$$


so that N must be large. For example, if N = 64 and $a \approx 0.1$ fm the condition
(16.133) would be reasonably satisfied by a light hadron mass. But remember 
that each field at each lattice point is an independent degree of freedom: deal- 
ing with integrals such as (16.87) presents a formidable numerical challenge. 
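
A rough feeling for the numbers can be had from the following sketch. It
assumes a hypercubic lattice with N sites in each of four directions and an
SU(3) gauge field with one link variable per site and direction (eight real
parameters per link); the hadron mass of 770 MeV is an illustrative input of
ours, and quark degrees of freedom are ignored.

```python
HBARC_MEV_FM = 197.327  # hbar * c in MeV fm, used to convert 1/m to a length


def lattice_budget(n_per_side, spacing_fm, hadron_mass_mev):
    """Return (L in fm, 1/m in fm, number of sites, real gauge parameters)."""
    size_fm = n_per_side * spacing_fm            # L = N a
    compton_fm = HBARC_MEV_FM / hadron_mass_mev  # 1/m expressed in fm
    n_sites = n_per_side ** 4                    # N^4 hypercubic lattice (assumed)
    n_gauge_reals = n_sites * 4 * 8              # 4 links per site, 8 reals per SU(3) link
    return size_fm, compton_fm, n_sites, n_gauge_reals


size, compton, sites, reals = lattice_budget(64, 0.1, 770.0)
print(f"a = 0.1 fm, 1/m = {compton:.2f} fm, L = {size:.1f} fm")
print(f"{sites} sites, {reals} real gauge-field integration variables")
```

The hierarchy a < 1/m < L of (16.133) is respected, though hardly by large
factors, while the number of integration variables already runs to several
hundred million.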

Ignoring any statistical inaccuracy, the results will depend on the parameters
$g_L$ and N, where $g_L$ is the bare lattice gauge coupling (we assume for
simplicity that the quarks are massless). Despite the fact that $g_L$ is
dimensionless, we shall now see that its value actually controls the physical
size of the lattice spacing a, as a result of renormalization effects. The
computed mass of a hadron M, say, must be related to the only quantity with
mass dimension, $a^{-1}$, by a relation of the form


$$M = \frac{1}{a} f(g_L) . \qquad (16.134)$$


Thus in approaching the continuum limit $a \to 0$, we shall also have to change
$g_L$ suitably, so as to ensure that M remains finite. This is, of course, quite
analogous to saying that, in a renormalizable theory, the bare parameters of
the theory depend on the momentum cut-off $\Lambda$ in such a way that, as
$\Lambda \to \infty$,
finite values are obtained for the corresponding physical parameters (see the 
last paragraph of section 10.1.2, for example). In practice, of course, the 
extent to which the lattice ‘a’ can really be taken to be very small is severely 
limited by the computational resources available — that is, essentially, by the 
number of mesh points N. 
Equation (16.134) should therefore really read


$$M = \frac{1}{a} f(g_L(a)) . \qquad (16.135)$$


As $a \to 0$, M should be finite and independent of a. However, we know that
the behaviour of $g_L(a)$ at small scales is in fact calculable in perturbation
theory, thanks to the asymptotic freedom of QCD. This will allow us to
determine the form of $f(g_L)$, up to a constant, and lead to an interesting
prediction for M (equations (16.141)-(16.142)).

Differentiating (16.135) we find 


$$0 = \frac{dM}{da} = -\frac{1}{a^2} f(g_L(a))
      + \frac{1}{a}\frac{df}{dg_L}\frac{dg_L(a)}{da} \qquad (16.136)$$

so that

$$\left(a\frac{dg_L(a)}{da}\right)\frac{df}{dg_L} = f(g_L(a)) . \qquad (16.137)$$

Meanwhile, the scale dependence of $g_L$ is given (to one loop order) by

$$a\frac{dg_L(a)}{da} = \frac{\beta_0}{4\pi}\, g_L^3(a) , \qquad (16.138)$$

where the sign is the opposite of (15.47) since $a \sim \mu^{-1}$ is the
relevant scale parameter here (compare the comments after equation (16.124)).
The integration of (16.138) requires, as usual, a dimensionful constant of
integration (cf (15.53)):

$$\frac{g_L^2(a)}{4\pi} = \frac{1}{\beta_0 \ln(1/a^2\Lambda_L^2)} . \qquad (16.139)$$

Equation (16.139) shows that $g_L(a)$ tends logarithmically to zero as
$a \to 0$, as we expect from asymptotic freedom. $\Lambda_L$ can be regarded as
a lattice equivalent of the continuum $\Lambda_{\overline{\rm MS}}$, and it is
defined (at one loop order) by

$$\Lambda_L = \lim_{g_L \to 0} \frac{1}{a}
   \exp\left(-\frac{2\pi}{\beta_0 g_L^2}\right) . \qquad (16.140)$$

Equation (16.140) may also be read as showing that the lattice spacing a must
go exponentially to zero as $g_L$ tends to zero. Higher-order corrections can of
course be included.
In a similar way, integrating (16.137) using (16.138) gives, in (16.134),

$$\begin{aligned}
M &= \text{constant} \times \left[\frac{1}{a}
     \exp\left(-\frac{2\pi}{\beta_0 g_L^2}\right)\right] \qquad & (16.141) \\
  &= \text{constant} \times \Lambda_L . & (16.142)
\end{aligned}$$

Equation (16.141) is known as asymptotic scaling: it predicts how any physical
mass, expressed in lattice units $a^{-1}$, should vary as a function of $g_L$.
The form (16.142) is remarkable, as it implies that all calculated masses must
be proportional, in the continuum limit $a \to 0$, to the same universal scale
factor $\Lambda_L$.
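
The one-loop relations (16.139) and (16.140) are easy to explore numerically.
The sketch below is illustrative only: the function names are ours, and we
assume the normalization $\beta_0 = (33 - 2N_f)/12\pi$ for the one-loop
coefficient, which should be checked against the definition used with (15.47).
It confirms that (16.139) and (16.140) are inverses of one another, and that
$\ln(1/a\Lambda_L)$ is linear in $1/g_L^2$ with slope $2\pi/\beta_0$.

```python
import math


def beta0(n_flavours):
    """One-loop coefficient; normalization (33 - 2 N_f) / (12 pi) is assumed."""
    return (33 - 2 * n_flavours) / (12 * math.pi)


def a_times_lambda(g_lattice, n_flavours=0):
    """a * Lambda_L = exp(-2 pi / (beta0 g_L^2)), from (16.140) at one loop."""
    return math.exp(-2 * math.pi / (beta0(n_flavours) * g_lattice**2))


def g_squared_from_spacing(a_lambda, n_flavours=0):
    """Invert (16.139): g_L^2 = 4 pi / (beta0 ln(1 / (a Lambda_L)^2))."""
    return 4 * math.pi / (beta0(n_flavours) * math.log(1 / a_lambda**2))


# Straight-line check in the quenched (N_f = 0) case.
couplings = [0.9, 1.0, 1.1]
xs = [1 / g**2 for g in couplings]                      # 1 / g_L^2
ys = [-math.log(a_times_lambda(g)) for g in couplings]  # ln(1 / (a Lambda_L))
slope = (ys[-1] - ys[0]) / (xs[-1] - xs[0])
print("slope:", slope, " expected 2 pi / beta0:", 2 * math.pi / beta0(0))
```

Small changes in $g_L$ thus translate into exponentially large changes in the
lattice spacing measured in units of $1/\Lambda_L$.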

How are masses calculated on the lattice? The principle is very similar to
the way in which the ground state was selected out as $\tau_i \to -\infty$,
$\tau_f \to +\infty$ in (16.78). Consider a correlation function for a scalar
field, for simplicity:


$$\begin{aligned}
C(\tau) &= \langle\Omega|\hat\phi(\mathbf{x} = 0, \tau)\,\hat\phi(0)|\Omega\rangle \\
        &= \sum_n |\langle\Omega|\hat\phi(0)|n\rangle|^2\, e^{-E_n\tau} . \qquad (16.143)
\end{aligned}$$


As $\tau \to \infty$, the term with the minimum value of $E_n$, namely
$E_n = M_\phi$, will survive; $M_\phi$ can be measured from a fit to the
exponential fall-off as a function of $\tau$.
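
The way the lightest state emerges from (16.143) can be seen in a toy example.
The sketch below uses a made-up three-state spectrum (the overlaps and
energies are hypothetical, in lattice units) and the standard ‘effective
mass’ $\ln[C(\tau)/C(\tau+1)]$; it is an illustration of the principle, not
of the fitting procedures used in actual simulations.

```python
import math


def correlator(tau, spectrum):
    """C(tau) = sum_n |<Omega|phi|n>|^2 exp(-E_n tau) for (overlap^2, E_n) pairs."""
    return sum(z2 * math.exp(-energy * tau) for z2, energy in spectrum)


def effective_mass(tau, spectrum, step=1.0):
    """ln[C(tau) / C(tau + step)] / step; tends to the lowest E_n at large tau."""
    ratio = correlator(tau, spectrum) / correlator(tau + step, spectrum)
    return math.log(ratio) / step


# Hypothetical spectrum: (squared overlap, energy) in lattice units.
toy_spectrum = [(1.0, 0.5), (0.8, 1.2), (0.5, 2.0)]

for tau in (1, 5, 10, 20):
    print(tau, effective_mass(tau, toy_spectrum))
```

The effective mass falls towards the lowest energy (0.5 in this toy case) as
$\tau$ grows, the contamination from excited states dying away exponentially.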

The behaviour predicted by (16.141) and (16.142) can be tested in actual 
calculations. A quantity such as the $\rho$ meson mass is calculated via a
correlation function of the form (16.143), the result being expressed in terms
of a certain number of lattice units $a^{-1}$ at a certain value of $g_L$. By
comparison with the known $\rho$ mass, $a^{-1}$ can be converted to GeV. Then
the calculation is repeated for a different $g_L$ value and the new $a^{-1}$
(GeV) extracted. A plot of $\ln[a^{-1}({\rm GeV})]$ versus $1/g_L^2$ should then
give a straight line with slope $2\pi/\beta_0$ and intercept $\ln\Lambda_L$.
Figure 16.9 shows such a plot, taken from Ellis et al. (1996), from which it
appears that the calculations are indeed being performed close to the continuum
limit. The value of $\Lambda_L$ has been adjusted to fit the numerical data, and
has the value $\Lambda_L = 1.74$ MeV in this case. This may seem alarmingly far
from the kind of value expected for $\Lambda_{\rm QCD}$, but we must remember
that the renormalization schemes involved in the two cases are quite different.
In fact, we may expect $\Lambda_{\rm QCD} \approx 50\Lambda_L$ (Montvay and
Münster (1994), section 5.1.6).


16.5.2 The static $q\bar{q}$ potential


The calculations of $m_\rho$ represented in figure 16.9 were done in the
quenched approximation. As a first example of a calculation with dynamical
(unquenched) fermions we show in figure 16.10 a lattice calculation of the
static $q\bar{q}$ potential (Allton et al. 2002, UKQCD Collaboration), using
two degenerate flavours of

FIGURE 16.9
ln($a^{-1}$ in GeV) plotted against $1/g_L^2$; figure from R K Ellis, W J
Stirling and B R Webber (1996) QCD and Collider Physics, courtesy Cambridge
University Press, as adapted from Allton (1995).


FIGURE 16.10 

The static QCD potential, expressed in units of $r_0$. The broken curve is the
functional form (16.147). Figure reprinted with permission from C R Allton 
et al. (UKQCD Collaboration) Phys. Rev. D 65 054502 (2002). Copyright 
2002 by the American Physical Society.

dynamical quarks³ on a $16^3 \times 32$ lattice. As usual, one dimensionful
quantity has to be fixed in order to set the scale. In the present case this
has been done via the scale parameter $r_0$ of Sommer (1994), defined by


$$r_0^2 \left.\frac{dV}{dr}\right|_{r = r_0} = 1.65 . \qquad (16.144)$$


Applying (16.144) to the Cornell (Eichten et al. 1980) or Richardson (1979)
phenomenological potentials gives $r_0 \simeq 0.49$ fm, conveniently in the
range which is well-determined by $c\bar{c}$ and $b\bar{b}$ data. The data are
well described by the expression


$$V(r) = V_0 + \sigma r - \frac{A}{r} , \qquad (16.145)$$


where in accordance with (16.144)


$$\sigma = \frac{(1.65 - A)}{r_0^2} , \qquad (16.146)$$


and where $V_0$ has been chosen such that $V(r_0) = 0$. Thus (16.145) becomes

$$r_0 V(r) = (1.65 - A)\left(\frac{r}{r_0} - 1\right)
           - A\left(\frac{r_0}{r} - 1\right) . \qquad (16.147)$$


This is, up to a constant, exactly the functional form mentioned in chapter 1,
equation (1.33). The quantity $\sigma$ (there called b) is referred to as the
‘string tension’, and $\sqrt{\sigma}$ has a value of about 465 MeV in the
present calculations. Phenomenological models suggest a value of around 440
MeV (Eichten et al. 1980). The parameter A is found to have a value of about
0.3. In lowest-order perturbation theory, and in the continuum limit, A would
be given by one-gluon exchange as


$$A = \frac{4}{3}\alpha_s(\mu) \qquad (16.148)$$


where $\mu$ is some energy scale. This would give $\alpha_s \simeq 0.22$, a
reasonable value for $\mu \simeq 3$ GeV. Interestingly, the form (16.147) is
predicted by the ‘universal bosonic string model’ (Lüscher et al. 1980, Lüscher
1981), in which A has the ‘universal’ value $\pi/12 \simeq 0.26$.
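
The relations (16.144)-(16.148) can be cross-checked numerically. The sketch
below is illustrative: the function names are ours, the derivative is taken by
a simple central difference, and the conversion to MeV uses
$\hbar c \simeq 197.327$ MeV fm together with the rounded inputs A = 0.3 and
$r_0$ = 0.49 fm quoted in the text.

```python
import math

HBARC_MEV_FM = 197.327  # hbar * c in MeV fm


def r0_times_v(x, a_coeff):
    """r0 * V(r) of (16.147), as a function of x = r / r0."""
    return (1.65 - a_coeff) * (x - 1.0) - a_coeff * (1.0 / x - 1.0)


def sommer_combination(x, a_coeff, h=1e-6):
    """r^2 dV/dr in units with r0 = 1, using a central difference."""
    dv_dx = (r0_times_v(x + h, a_coeff) - r0_times_v(x - h, a_coeff)) / (2 * h)
    return x**2 * dv_dx


def sqrt_sigma_mev(a_coeff, r0_fm):
    """sqrt(sigma) from (16.146), converted from fm^-1 to MeV."""
    return math.sqrt(1.65 - a_coeff) / r0_fm * HBARC_MEV_FM


print("V(r0) =", r0_times_v(1.0, 0.3))                     # vanishes by construction
print("r0^2 dV/dr at r0 =", sommer_combination(1.0, 0.3))  # should reproduce 1.65
print("sqrt(sigma) in MeV =", sqrt_sigma_mev(0.3, 0.49))
print("alpha_s from (16.148) =", 0.75 * 0.3)
```

With these rounded inputs the string tension comes out within about one per
cent of the quoted 465 MeV, and (16.148) gives $\alpha_s$ close to 0.22.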

The existence of the linearly rising term with $\sigma > 0$ is a signal for
confinement, since (if the potential maintained this form) it would cost an
infinite amount of energy to separate a quark and an antiquark. But at some
point, enough energy will be stored in the ‘string’ to create a $q\bar{q}$ pair
from the vacuum: the string then breaks, and the two $q\bar{q}$ pairs form
mesons. There is no
evidence for string breaking in figure 16.10, but we must note that the largest 
distance probed is only about 1.3 fm. 


³Comparison with matched data in the quenched approximation revealed very
little difference, in this case.

16.5.3 Calculation of $\alpha_s(M_Z^2)$


Our second example of a precision lattice calculation with dynamical quarks is
the determination of $\alpha_s(M_Z^2)$ by Davies et al. (2008) (HPQCD
Collaboration). The reported value is

$$\alpha_s(M_Z^2) = 0.1183(8) . \qquad (16.149)$$


The accuracy of this result is extremely impressive, and it implies that this 
determination is an important ingredient in the world average value quoted 
in (15.62). It is worth sketching some of the elements that went into this 
landmark calculation. 

The work used 12 gluon configurations from the MILC collaboration (Aubin 
et al. 2004), and built on a joint effort by several groups (see Davies et al. 
(HPQCD, UKQCD, MILC, and Fermilab collaborations) 2004). Vacuum po- 
larization effects from all three light quarks u, d and s were included, using a 
Symanzik-improved staggered-quark discretization, with rooting. The effects 
of c and b quarks were incorporated using perturbation theory. The strange 
quark mass was physical, while the u and d quark mass (set to be the same) 
was three times too large, but small enough for chiral perturbation theory 
(see chapter 18) to be reliable for extrapolating to the physical mass. 

There were 5 parameters: $m_u = m_d$, $m_s$, $m_c$, $m_b$ and the bare QCD
coupling $g_L$ (or equivalently the lattice spacing a). The mass parameters
were tuned to reproduce experimentally measured values of $m_\pi^2$,
$2m_K^2 - m_\pi^2$, $m_{D_s}$ and $m_\Upsilon$ respectively. The lattice
spacing was adjusted to make the $\Upsilon - \Upsilon'$ mass difference agree
with experiment (Gray et al. 2005). With the free parameters all determined,
the simulation accurately reproduced QCD, and predictions for physical
quantities could proceed. En passant, we show in figure 16.11 results obtained
(Davies et al. 2004), divided by experimental results, for nine different
quantities, without and with quark vacuum polarization (left and right panels
respectively). The values on the left deviate from experiment by as much as
10% to 15%; those on the right agree with experiment to within systematic and
statistical errors of 3% or less.

To extract a value of the coupling constant, the general strategy is to
calculate (with the tuned simulation) a non-perturbative numerical value for
a short-distance quantity, for which perturbation theory should be reliable.
Then, by comparing the numerically computed value to the known perturbative
expansion, a value of the coupling constant can be found.

In this case, the quantities calculated were vacuum expectation values of
small Wilson loop operators $W_{mn}$ (and related quantities) where


$$W_{mn} = \frac{1}{3}\langle 0|{\rm Re\,Tr\,P}
   \exp\left[-ig_L\oint A\cdot dx\right]|0\rangle , \qquad (16.150)$$


where P denotes path ordering, $A_\mu = \boldsymbol{\lambda}/2\cdot
\boldsymbol{A}_\mu$ is the QCD (matrix-valued) vector potential, and the
integral is over a closed $ma \times na$ rectangular path, not necessarily
planar. The 1 × 1 Wilson loop is just the vev of the simple
plaquette operator Up of section 16.2.3.

FIGURE 16.11

Lattice QCD results divided by experimental results for nine different
quantities, without and with quark vacuum polarization (left and right panels,
respectively). Figure reprinted with permission from C T H Davies et al.
(HPQCD Collaboration) Phys. Rev. Lett. 92 022001 (2004). Copyright 2004
by the American Physical Society.


In order to compare the numerical evaluation of (16.150) with perturbation 
theory, one has to decide what is a suitable expansion parameter. It was shown 
by Lepage and Mackenzie (1993) that the obvious first choice, the bare lattice 
coupling constant, is generally a poor one due to renormalization effects, even 
for short distance quantities. Instead, a renormalized coupling should be used 
— but this raises the questions of what renormalization scheme to adopt, and 
what scale at which to evaluate the (now running) coupling. In the present 
case, the scheme proposed by Brodsky, Lepage and Mackenzie (1983) was 
followed. It is defined in terms of the heavy quark potential V(q), and is 
called the ‘V-scheme’. The strong coupling in the V-scheme is defined by


$$V(q) = -\frac{4}{3}\,\frac{4\pi\alpha_V(q)}{q^2} \qquad (16.151)$$

with no higher-order corrections.


The numerically calculated short-distance quantities $Y^{(i)}$ are therefore
to be expanded as the series


$$Y^{(i)} = \sum_{n=1} c_n^{(i)}\,\alpha_V^n(d^{(i)}/a) , \qquad (16.152)$$

where $c_n^{(i)}$ and $d^{(i)}$ are dimensionless constants independent of the
lattice spacing a, but dependent on the particular $Y^{(i)}$, and
$\alpha_V(d^{(i)}/a)$ is the running QCD coupling in the V-scheme, with
$N_f = 3$ light quark flavours. The perturbative coefficients $c_n$ for the
various Y’s were computed using Feynman diagrams, for $n \le 3$, for the same
quark and gluon actions which were used to create the sets of gluon field
configurations employed in the numerical evaluation of the Y’s. The
renormalization scale $d^{(i)}/a$ varies for each short-distance quantity,
being chosen according to the Lepage-Mackenzie (1993) prescription
(or in some cases a more robust procedure due to Hornbostel, Lepage and 
Morningstar (2003)). 

There were 22 $Y^{(i)}$’s, each of which was analyzed separately, fitting the
expansion (16.152) to the 12 values of that Y calculated using the 12 gluon
configurations. In the simplest terms, the result of each such fit would be the
value of $\alpha_V$ at a particular scale, which was chosen to be
$\alpha_V$(7.5 GeV). The values $\alpha_V(d^{(i)}/a_j)$ required at the scales
$d^{(i)}/a_j$ were found by numerically integrating the evolution equation (at
four-loop order) for $\alpha_V$; here $a_j$ is the lattice spacing for each
configuration (there were 6 different spacings). In fact, the fitting was more
sophisticated, including further parameters related to various corrections;
the interested reader can consult Davies et al. (2008) for the details. Having
obtained $\alpha_V$(7.5 GeV), this was then converted to the
$\overline{\rm MS}$ scheme, using the relation (Brodsky, Lepage and Mackenzie
1983)


ay (jt) = ayle în). (16.153) 


Finally, the resultant $\alpha_{\overline{\rm MS}}$ was evolved to $M_Z$. The value (16.149) is the final
result after performing a weighted average over the 22 separate determina- 
tions. A full discussion of the error estimate, which includes finite lattice 
spacing, finite lattice volume, and chiral extrapolation uncertainties, is given 
in Davies et al. (2008). 


16.5.4 Hadron masses 


For our last example of a precise lattice QCD calculation, it is appropriate to 
consider the mass spectrum of light hadrons. After all, protons and neutrons 
account for nearly all the mass of ordinary matter, and 95% of their mass is 
the result of QCD interactions. It has long been a fundamental challenge to 
predict hadron masses accurately from QCD. 

As one example of such calculations, we show in figure 16.12 the light 
hadron spectrum of QCD as reported by Dürr et al. (2008). Horizontal lines
and bands are the experimental values (which have been isospin-averaged)
with their decay widths. The solid circles are the predicted values. Vertical
error bars represent combined statistical and systematic error estimates. The
masses of the $\pi$, K and $\Xi$ have no error bars, because they have been
used to set the values of $m_u = m_d$, $m_s$ and the overall scale,
respectively. Once again,
the agreement with experiment is very impressive.

FIGURE 16.12
The light hadron spectrum of QCD, from Dürr et al. (2008). (See color plate 
IL.) 


These calculations used a Symanzik-improved gauge action (Lüscher and
Weisz 1985), and 2+1 flavours of light dynamical Wilson fermions, with various
improvements (Morningstar and Peardon 2004). The physical scale was set either
by fitting to the mass of the $\Xi$, or to the mass of the $\Omega$; the two
ways gave consistent results. Pion masses in the range (approximately) 800
MeV to 190 MeV were used to extrapolate to the physical value, with lattice
sizes approximately four times the inverse pion mass. A particular type
of finite-volume effect arises in the case of strongly decaying resonant states:
a procedure for reconstructing the infinite-volume resonance mass, given by
Lüscher (1986, 1991a, 1991b), was followed here. This was satisfactory, except
for the $\rho$ and $\Delta$ at the lightest pion mass point, which was omitted
from the extrapolation for these two channels. For further details, and
additional references, we refer the reader to the supplementary material to
Dürr et al.
(2008) provided online. 

We have been able to give only a brief introduction to what is now, almost
forty years after its initial inception by Wilson (1974), the highly mature field 
of lattice QCD. A great deal of effort has gone into ingenious and subtle 
improvements to the lattice action, to the numerical algorithms, and to the 
treatment of fermions — to name a few of the issues. Lattice QCD is now 
a major part of particle physics. From the perspective of this chapter and 
the previous one, we can confidently say that, both in the short-distance 
(perturbative) regime, and in the long-distance (non-perturbative) regime, 
QCD is established as the correct theory of the strong interactions of quarks, 
beyond reasonable doubt.

Problems

16.1 Verify equation (16.9). 

16.2 Verify equation (16.10). 

16.3 Show that the momentum space version of (16.18) is (16.19). 

16.4 Use (16.31) in (16.33) to verify (16.34). 

16.5 Verify (16.68) and (16.70). 


16.6 In a modified one-dimensional Ising model, spin variables $s_n$ at sites
labelled by n = 1, 2, 3, ..., N take the values $s_n = \pm 1$, and the energy
of each spin configuration is

$$E = -\sum_{n=1}^{N-1} J_n s_n s_{n+1} ,$$

where all the constants $J_n$ are positive. Show that the partition function
$Z_N$ is given by

$$Z_N = 2\prod_{n=1}^{N-1} (2\cosh K_n)$$

where $K_n = J_n/k_B T$. Hence calculate the entropy for the particular case in
which all the $J_n$’s are equal to J and $N \gg 1$, and discuss the behaviour of
the entropy in the limits $T \to \infty$ and $T \to 0$.

Let ‘p’ denote a particular site such that $1 \ll p \ll N$. Show that the
average value $\langle s_p s_{p+1}\rangle$ of the product $s_p s_{p+1}$ is
given by

$$\langle s_p s_{p+1}\rangle = \frac{1}{Z_N}\frac{\partial Z_N}{\partial K_p} .$$

Show further that

$$\langle s_p s_{p+j}\rangle = \frac{1}{Z_N}
   \frac{\partial^j Z_N}{\partial K_p\,\partial K_{p+1}\cdots\partial K_{p+j-1}} .$$

Hence show that in the case $J_1 = J_2 = \ldots = J_{N-1} = J$,

$$\langle s_p s_{p+j}\rangle = e^{-ja/\xi}$$

where

$$\xi = -a/[\ln(\tanh K)] ,$$

and $K = J/k_B T$. Discuss the physical meaning of $\xi$, considering the
$T \to \infty$ and $T \to 0$ limits explicitly.
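
Readers who wish to check their answers to problem 16.6 numerically may find
the following brute-force sketch useful. It is our own illustration, assuming
an open chain with the Boltzmann weight $\exp(\sum_n K_n s_n s_{n+1})$; sites
are indexed from 0 in the code, and the enumeration is only practical for
short chains.

```python
import itertools
import math


def brute_force(couplings):
    """Enumerate all configurations of an open chain with K_n = J_n / (k_B T).

    Returns the partition function and a list of (spins, weight) pairs.
    """
    n_spins = len(couplings) + 1
    z = 0.0
    weights = []
    for spins in itertools.product((-1, 1), repeat=n_spins):
        exponent = sum(k * spins[i] * spins[i + 1] for i, k in enumerate(couplings))
        w = math.exp(exponent)  # exp(-E / k_B T) with E = -sum J_n s_n s_{n+1}
        z += w
        weights.append((spins, w))
    return z, weights


def correlation(couplings, p, j):
    """Thermal average <s_p s_{p+j}> by direct summation (p is 0-based)."""
    z, weights = brute_force(couplings)
    return sum(w * s[p] * s[p + j] for s, w in weights) / z


def closed_form_z(couplings):
    """Z_N = 2 * prod_n (2 cosh K_n)."""
    return 2.0 * math.prod(2.0 * math.cosh(k) for k in couplings)


ks = [0.3, 0.7, 1.1, 0.5, 0.9]
print(brute_force(ks)[0], closed_form_z(ks))
print(correlation([0.6] * 7, 2, 3), math.tanh(0.6) ** 3)
```

For equal couplings the correlation is $(\tanh K)^j$, which is the content of
the exponential form with correlation length $\xi$ given above.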
Spontaneously Broken Global Symmetry


Previous chapters have introduced the non-Abelian symmetries SU(2) and 
SU(3) in both global and local forms, and we have seen how they may be 
applied to describe such typical physical phenomena as particle multiplets, 
and massless gauge fields. Remarkably enough, however, these symmetries 
are also applied, in the Standard Model, in two cases where the physical 
phenomena appear to be very different. Consider the following two questions: 
(i) Why are there no signs in the baryonic spectrum, such as parity doublets 
in particular, of the global chiral symmetry introduced in section 12.3.2? (ii) 
How can weak interactions be described by a local non-Abelian gauge theory 
when we know the mediating gauge field quanta are not massless? The answers 
to these questions each involve the same fundamental idea, which is a crucial 
component of the Standard Model, and perhaps also of theories which go 
beyond it. This is the idea that a symmetry can be ‘spontaneously broken’, 
or ‘hidden’. By contrast, the symmetries considered hitherto may be termed 
‘manifest symmetries’. 

The physical consequences of spontaneous symmetry breaking turn out to 
be rather different in the global and local cases. However, the essentials for 
a theoretical understanding of the phenomenon are contained in the simpler 
global case, which we consider in this chapter. The application to sponta- 
neously broken chiral symmetry will be treated in chapter 18, and sponta- 
neously broken local symmetry will be discussed in chapter 19, and applied in 
chapter 22. 




17.1 Introduction 


We begin by considering, in response to question (i) above, what could go 
wrong with the argument for symmetry multiplets that we gave in chapter 12. 
To understand this, we must use the field theory formulation of section 12.3, 
in which the generators of the symmetry are Hermitian field operators, and 
the states are created by operators acting on the vacuum. Thus consider two 
states $|A\rangle$, $|B\rangle$¹:

$$|A\rangle = \hat\phi_A^\dagger|0\rangle , \quad
  |B\rangle = \hat\phi_B^\dagger|0\rangle \qquad (17.1)$$


¹We now revert to the ordinary notation $|0\rangle$ for the vacuum state,
rather than $|\Omega\rangle$, but it must be borne in mind that $|0\rangle$ is
the full (interacting) vacuum.

where $\hat\phi_A^\dagger$ and $\hat\phi_B^\dagger$ are related to each other
by (cf (12.100))

$$[\hat Q, \hat\phi_A^\dagger] = \hat\phi_B^\dagger \qquad (17.2)$$

for some generator $\hat Q$ of a symmetry group, such that

$$[\hat Q, \hat H] = 0 . \qquad (17.3)$$

(17.2) is equivalent to

$$\hat U\hat\phi_A^\dagger\hat U^{-1} \approx
  \hat\phi_A^\dagger + i\epsilon\hat\phi_B^\dagger \qquad (17.4)$$


for an infinitesimal transformation $\hat U \approx 1 + i\epsilon\hat Q$. Thus
$\hat\phi_A^\dagger$ is ‘rotated’ into $\hat\phi_B^\dagger$ by $\hat U$, and
the operators will create states related by the symmetry transformation. We
want to see what are the assumptions necessary to prove that


$$E_A = E_B , \quad \text{where } \hat H|A\rangle = E_A|A\rangle
  \text{ and } \hat H|B\rangle = E_B|B\rangle . \qquad (17.5)$$

We have

$$E_B|B\rangle = \hat H|B\rangle = \hat H\hat\phi_B^\dagger|0\rangle
  = \hat H(\hat Q\hat\phi_A^\dagger - \hat\phi_A^\dagger\hat Q)|0\rangle . \qquad (17.6)$$


Now if

$$\hat Q|0\rangle = 0 \qquad (17.7)$$


we can rewrite the right-hand side of (17.6) as


$$\begin{aligned}
\hat H\hat Q\hat\phi_A^\dagger|0\rangle
 &= \hat Q\hat H\hat\phi_A^\dagger|0\rangle \ \text{using (17.3)}
  = \hat Q\hat H|A\rangle = E_A\hat Q|A\rangle \\
 &= E_A\hat Q\hat\phi_A^\dagger|0\rangle
  = E_A(\hat\phi_B^\dagger + \hat\phi_A^\dagger\hat Q)|0\rangle \ \text{using (17.2)} \\
 &= E_A|B\rangle \ \text{if (17.7) holds;} \qquad (17.8)
\end{aligned}$$


whence, comparing (17.8) with (17.6), we see that

$$E_A = E_B \quad \text{if (17.7) holds.} \qquad (17.9)$$

Remembering that $\hat U = \exp(i\alpha\hat Q)$, we see that (17.7) is
equivalent to

$$|0\rangle' \equiv \hat U|0\rangle = |0\rangle . \qquad (17.10)$$


Thus a multiplet structure will emerge provided that the vacuum is left
invariant under the symmetry transformation. The ‘spontaneously broken
symmetry’ situation arises in the contrary case, that is, when the vacuum is
not invariant under the symmetry, which is to say when


$$\hat Q|0\rangle \neq 0 . \qquad (17.11)$$


In this case, the argument for the existence of symmetry multiplets breaks
down, and although the Hamiltonian or Lagrangian may exhibit a non-Abelian
symmetry, this will not be manifested in the form of multiplets of
mass-degenerate particles.

The preceding italicized sentence does correctly define what is meant by 
a spontaneously broken symmetry in field theory, but there is another way of 
thinking about it which is somewhat less abstract though also less rigorous. 
The basic condition is $\hat Q|0\rangle \neq 0$, and it seems tempting to infer
that, in this case, the application of $\hat Q$ to the vacuum gives, not zero,
but another possible vacuum, $|0\rangle'$. Thus we have the physically
suggestive idea of ‘degenerate vacua’ (they must be degenerate since
$[\hat Q, \hat H] = 0$). We shall see in a moment
why this notion, though intuitively helpful, is not rigorous. 

It would seem, in any case, that the properties of the vacuum are all- 
important, so we begin our discussion with a somewhat formal, but nonethe- 
less fundamental, theorem about the quantum field vacuum. 




17.2 The Fabri–Picasso theorem


Suppose that a given Lagrangian $\hat{\mathcal L}$ is invariant under some
one-parameter continuous global internal symmetry with a conserved Noether
current $\hat j^\mu$, such that $\partial_\mu\hat j^\mu = 0$. The associated
‘charge’ is the Hermitian operator $\hat Q = \int \hat j^0\, d^3x$, and
$\dot{\hat Q} = 0$. We have hitherto assumed that the transformations of
such a U(1) group are representable in the space of physical states by unitary
operations $\hat U(\lambda) = \exp(i\lambda\hat Q)$ for arbitrary $\lambda$,
with the vacuum invariant under $\hat U$, so that $\hat Q|0\rangle = 0$. Fabri
and Picasso (1966) showed that there are actually two possibilities:


(i) $\hat Q|0\rangle = 0$, and $|0\rangle$ is an eigenstate of $\hat Q$ with
eigenvalue 0, so that $|0\rangle$ is invariant under $\hat U$ (i.e.
$\hat U|0\rangle = |0\rangle$);
or


(ii) $\hat Q|0\rangle$ does not exist in the space (its norm is infinite).


The statement (ii) is technically more correct than the more intuitive
statements ‘$\hat Q|0\rangle \neq 0$’ or ‘$\hat U|0\rangle = |0\rangle'$’,
suggested above.

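To prove this result, consider the vacuum matrix element $\langle 0|\hat{j}^0(x)\hat{Q}|0\rangle$.
From translation invariance, implemented by the unitary operator² $\hat{U}(x) = \exp i\hat{P}\cdot x$
(where $\hat{P}^\mu$ is the 4-momentum operator) we obtain

$$\begin{aligned}
\langle 0|\hat{j}^0(x)\hat{Q}|0\rangle &= \langle 0|e^{i\hat{P}\cdot x}\,\hat{j}^0(0)\,e^{-i\hat{P}\cdot x}\,\hat{Q}|0\rangle \\
&= \langle 0|e^{i\hat{P}\cdot x}\,\hat{j}^0(0)\,\hat{Q}\,e^{-i\hat{P}\cdot x}|0\rangle
\end{aligned}$$

where the second line follows from

$$[\hat{P}^\mu, \hat{Q}] = 0 \qquad (17.12)$$

since $\hat{Q}$ is an internal symmetry. But the vacuum is an eigenstate of $\hat{P}^\mu$ with
eigenvalue zero, and so

$$\langle 0|\hat{j}^0(x)\hat{Q}|0\rangle = \langle 0|\hat{j}^0(0)\hat{Q}|0\rangle \qquad (17.13)$$

which states that the matrix element we started from is in fact independent
of $x$. Now consider the norm of $\hat{Q}|0\rangle$:

$$\begin{aligned}
\langle 0|\hat{Q}\hat{Q}|0\rangle &= \int d^3x\, \langle 0|\hat{j}^0(x)\hat{Q}|0\rangle \qquad (17.14) \\
&= \int d^3x\, \langle 0|\hat{j}^0(0)\hat{Q}|0\rangle, \qquad (17.15)
\end{aligned}$$

which must diverge in the infinite volume limit, unless $\hat{Q}|0\rangle = 0$. Thus either
$\hat{Q}|0\rangle = 0$ or $\hat{Q}|0\rangle$ has infinite norm. The foregoing can be easily generalized
to non-Abelian symmetry operators $\hat{T}_i$.

² If this seems unfamiliar, it may be regarded as the 4-dimensional generalization of the
transformation (I.7) in appendix I of volume 1, from Schrödinger picture operators at $t = 0$
to Heisenberg operators at $t \neq 0$.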

Remarkably enough, the argument can also, in a sense, be reversed. Coleman
(1966) proved that if an operator

$$\hat{Q}(t) = \int d^3x\, \hat{j}^0(\boldsymbol{x}, t) \qquad (17.16)$$

is the spatial integral of the $\mu = 0$ component of a 4-vector (but not assumed
to be conserved), and if it annihilates the vacuum

$$\hat{Q}(t)|0\rangle = 0, \qquad (17.17)$$

then in fact $\partial_\mu \hat{j}^\mu = 0$, $\hat{Q}$ is independent of $t$, and the symmetry is unitarily
implementable by operators $\hat{U} = \exp(i\lambda\hat{Q})$.

We might now simply proceed to the chiral symmetry application. We 
believe, however, that the concept of spontaneous symmetry breaking is so 
important to particle physics that a more extended discussion is amply justi- 
fied. In particular, there are crucial insights to be gained by considering the 
analogous phenomenon in condensed matter physics. After a brief look at the 
ferromagnet, we shall describe the Bogoliubov model for the ground state of a 
superfluid, which provides an important physical example of a spontaneously 
broken global Abelian U(1) symmetry. We shall see that the excitations away 
from the ground state are massless modes and we shall learn, via Goldstone’s 
theorem, that such modes are an inevitable result of spontaneously breaking a 
global symmetry. Next, we shall introduce the ‘Goldstone model’ which is the 
simplest example of a spontaneously broken global U(1) symmetry, involving 
just one complex scalar field. The generalization of this to the non-Abelian 
case will draw us in the direction of the Higgs sector of the Standard Model.
Returning to condensed matter systems, we introduce the BCS ground state
for a superconductor, in a way which builds on the Bogoliubov model of a 
superfluid. We are then prepared for the application, in chapter 18, to spon- 
taneous chiral symmetry breaking (question (i) above), following Nambu’s 
profound analogy with one aspect of superconductivity. In chapter 19 we 
shall see how a different aspect of superconductivity provides a model for the 
answer to question (ii) above. 




17.3 Spontaneously broken symmetry in condensed 
matter physics 


17.3.1 The ferromagnet 


We have seen that everything depends on the properties of the vacuum state. 
An essential aid to understanding hidden symmetry in quantum field theory 
is provided by Nambu’s (1960) remarkable insight that the vacuum state of a 
quantum field theory is analogous to the ground state of an interacting many- 
body system. It is the state of lowest energy — the equilibrium state, given 
the kinetic and potential energies as specified in the Hamiltonian. Now the 
ground state of a complicated system (for example, one involving interacting 
fields) may well have unsuspected properties — which may, indeed, be very 
hard to predict from the Hamiltonian. But we can postulate (even if we 
cannot yet prove) properties of the quantum field theory vacuum $|0\rangle$ which
are analogous to those of the ground states of many physically interesting 
many-body systems — such as superfluids and superconductors, to name two 
with which we shall be principally concerned. 

Now it is generally the case, in quantum mechanics, that the ground state 
of any system described by a Hamiltonian is non-degenerate. Sometimes we 
may meet systems in which apparently more than one state has the same 
lowest energy eigenvalue. Yet in fact none of these states will be the true 
ground state: tunnelling will take place between the various degenerate states, 
and the true ground state will turn out to be a unique linear superposition 
of them. This is, in fact, the only possibility for systems of finite spatial 
extent, though in practice a state which is not the true ground state may 
have an extremely long lifetime. However, in the case of fields (extending 
presumably throughout all space), the Fabri-Picasso theorem shows that there 
is an alternative possibility, which is often described as involving a ‘degenerate 
ground state’ — a term we shall now elucidate. In case (a) of the theorem, the 
ground state is unique. For, suppose that several ground states $|0, a\rangle, |0, b\rangle, \ldots$
existed, with the symmetry unitarily implemented. Then one ground state will
be related to another by

$$|0, a\rangle = e^{i\lambda\hat{Q}}\,|0, b\rangle \qquad (17.18)$$

for some $\lambda$. However, in case (a) the charge annihilates a ground state, and
so all of them are really identical. In case (b), on the other hand, we cannot
write (17.18), since $\hat{Q}|0\rangle$ does not exist, and we do have the possibility
of many degenerate ground states. In simple models one can verify that 
these alternative ground states are all orthogonal to each other, in the infinite 
volume limit — or perhaps more physically, the limit in which the number 
of degrees of freedom becomes infinite. And each member of every “tower” 
of excited states, built on these alternative ground states, is also orthogonal 
to all the members of other towers. But any single tower must constitute a 
complete space of states. It follows that states in different towers belong to 
different complete spaces of states, that is to different — and inequivalent — 
“worlds”, each one built on one of the possible orthogonal ground states. 

At first sight, a familiar example of these ideas seems to be that of a ferromagnet,
below its Curie temperature $T_C$. Consider an ‘ideal Heisenberg
ferromagnet’ with $N$ atoms each of spin 1/2, described by a Hamiltonian of
Heisenberg exchange form $\hat{H}_S = -J \sum \hat{\boldsymbol{S}}_i \cdot \hat{\boldsymbol{S}}_j$, where $i$ and $j$ label the atomic
sites. This Hamiltonian is invariant under spatial rotations, since it only
depends on the dot product of the spin operators. Such rotations are implemented
by unitary operators $\exp(i\hat{\boldsymbol{S}} \cdot \boldsymbol{\alpha})$ where $\hat{\boldsymbol{S}} = \sum_i \hat{\boldsymbol{S}}_i$, and spins at
different sites are assumed to commute. As usual with angular momentum
in quantum mechanics, the eigenstates of $\hat{H}_S$ are labelled by the eigenvalues
of total squared spin, and of one component of spin, say of $\hat{S}_z = \sum_i \hat{S}_{iz}$.
The quantum mechanical ground state of $\hat{H}_S$ is an eigenstate with total spin
quantum number $S = N/2$, and this state is $(2 \cdot N/2 + 1) = (N + 1)$-fold
degenerate, allowing for all the possible eigenvalues $(N/2, N/2 - 1, \ldots, -N/2)$
of $\hat{S}_z$ for this value of $S$. We are free to choose any one of these degenerate
states as ‘the’ ground state, say the state with eigenvalue $S_z = N/2$.

It is clear that the ground state is not invariant under the spin-rotation
symmetry of $\hat{H}_S$, which would require the eigenvalues $S = S_z = 0$. Furthermore,
this ground state is degenerate. So two important features of what
we have so far learned to expect of a spontaneously broken symmetry are
present, namely ‘the ground state is not invariant under the symmetry of
the Hamiltonian’, and ‘the ground state is degenerate’. However, it has to
be emphasized that this ferromagnetic ground state does, in fact, respect the
symmetry of $\hat{H}_S$, in the sense that it belongs to an irreducible representation
of the symmetry group: the unusual feature is that it is not the ‘trivial’ (sin- 
glet) representation, as would be the case for an invariant ground state. The 
spontaneous symmetry breaking which is the true model for particle physics 
is that in which a many body ground state is not an eigenstate (trivial or 
otherwise) of the symmetry operators of the Hamiltonian: rather it is a su- 
perposition of such eigenstates. We shall explore this for the superfluid and 
the superconductor in due course. 

Nevertheless, there are some useful insights to be gained from the ferro- 
magnet. First, consider two ground states differing by a spin rotation. In the 
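first, the spins are all aligned along the 3-axis, say, and in the second along
the axis $\hat{\boldsymbol{n}} = (0, \sin\alpha, \cos\alpha)$. Thus the first ground state is

$$\chi_0 = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} \cdots \begin{pmatrix} 1 \\ 0 \end{pmatrix} \quad (N \text{ products}) \qquad (17.19)$$

while the second is (cf (4.31), (4.32))

$$\chi_0^{(\alpha)} = \begin{pmatrix} \cos\alpha/2 \\ i\sin\alpha/2 \end{pmatrix} \begin{pmatrix} \cos\alpha/2 \\ i\sin\alpha/2 \end{pmatrix} \cdots \begin{pmatrix} \cos\alpha/2 \\ i\sin\alpha/2 \end{pmatrix}. \qquad (17.20)$$

The scalar product of (17.19) and (17.20) is $(\cos\alpha/2)^N$, which goes to zero as
$N \to \infty$. Thus any two such ‘rotated ground states’ are indeed orthogonal in
the infinite volume (or infinite number of degrees of freedom) limit.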
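The decay of the overlap $(\cos\alpha/2)^N$ is easy to see numerically. The following is a minimal sketch, not part of the original argument: the function names `spin_up_along` and `rotated_overlap` are our own, and the spinor convention is the one used in (17.20).

```python
import math

def spin_up_along(alpha):
    """Two-component spinor for spin 'up' along n = (0, sin(alpha), cos(alpha))."""
    return (complex(math.cos(alpha / 2), 0.0), complex(0.0, math.sin(alpha / 2)))

def rotated_overlap(n_spins, alpha):
    """Scalar product of the product states (17.19) and (17.20) for n_spins sites."""
    up = spin_up_along(0.0)
    rotated = spin_up_along(alpha)
    # Single-site overlap; the N-site overlap is its N-th power because
    # both states are simple products over sites.
    single = sum(a.conjugate() * b for a, b in zip(up, rotated))
    return single ** n_spins

if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        print(n, abs(rotated_overlap(n, 0.3)))
```

Even for a small rotation angle the overlap falls off exponentially in $N$, which is the sense in which the rotated ground states become orthogonal as the number of degrees of freedom grows.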

We may also enquire about the excited states built on one such ground
state, say the one with $\hat{S}_z$ eigenvalue $N/2$. Suppose for simplicity that
the magnet is one-dimensional (but the spins have all three components).
Consider the state $\chi_n = \hat{S}_{n-}\chi_0$ where $\hat{S}_{n-}$ is the spin lowering operator
$\hat{S}_{n-} = (\hat{S}_{nx} - i\hat{S}_{ny})$ at site $n$, such that

$$\hat{S}_{n-} \begin{pmatrix} 1 \\ 0 \end{pmatrix}_n = \begin{pmatrix} 0 \\ 1 \end{pmatrix}_n \qquad (17.21)$$

so $\hat{S}_{n-}\chi_0$ differs from the ground state $\chi_0$ by having the spin at site $n$ flipped.
The action of $\hat{H}_S$ on $\chi_n$ can be found by writing

$$\sum_{i \neq j} \hat{\boldsymbol{S}}_i \cdot \hat{\boldsymbol{S}}_j = \frac{1}{2} \sum_{i \neq j} (\hat{S}_{i-}\hat{S}_{j+} + \hat{S}_{j-}\hat{S}_{i+}) + \sum_{i \neq j} \hat{S}_{iz}\hat{S}_{jz} \qquad (17.22)$$

(remembering that spins on different sites commute), where $\hat{S}_{i+} = \hat{S}_{ix} + i\hat{S}_{iy}$.
Since all $\hat{S}_{i+}$ operators give zero on a spin ‘up’ state, the only non-zero
contributions from the first (bracketed) term in (17.22) come from terms in
which either $\hat{S}_{i+}$ or $\hat{S}_{j+}$ act on the ‘down’ spin at $n$, so as to restore it to ‘up’.
The ‘partner’ operator $\hat{S}_{i-}$ (or $\hat{S}_{j-}$) then simply lowers the spin at $i$ (or $j$),
leading to the result

$$\frac{1}{2} \sum_{i \neq j} (\hat{S}_{i-}\hat{S}_{j+} + \hat{S}_{j-}\hat{S}_{i+})\,\chi_n = \sum_{i \neq n} \chi_i. \qquad (17.23)$$

Thus the state $\chi_n$ is not an eigenstate of $\hat{H}_S$. However, a little more work
shows that the superpositions

$$\tilde{\chi}_q = \frac{1}{\sqrt{N}} \sum_n e^{iqna}\, \chi_n \qquad (17.24)$$

are eigenstates; here $q$ is one of the discretized wavenumbers produced by
appropriate boundary conditions, as is usual in one-dimensional ‘chain’ problems,
and $a$ is the spacing between sites. The states (17.24) represent spin waves, and they have the important
feature that for low $q$ (long wavelength) their frequency $\omega$ tends to zero with
$q$ (actually $\omega \propto q^2$). In this respect, therefore, they behave like massless particles
when quantized, and this is another feature we should expect when a
symmetry is spontaneously broken.
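The statement that the plane-wave superpositions are eigenstates with $\omega \propto q^2$ at small $q$ can be checked numerically in the one-flipped-spin sector. The sketch below rests on assumptions that are ours, not the text's: nearest-neighbour exchange on a periodic chain with each bond counted once, spin 1/2, and unit lattice spacing. Under these assumptions $(\hat{H}_S - E_0)\chi_n = J\chi_n - \tfrac{1}{2}J(\chi_{n-1} + \chi_{n+1})$, so the expected excitation energy is $J(1 - \cos q)$. All function names are illustrative.

```python
import cmath
import math

def apply_one_magnon_hamiltonian(amps, coupling=1.0):
    """Apply (H - E0) to a one-flipped-spin state on a periodic chain.

    amps[n] is the amplitude of chi_n. Assumes nearest-neighbour exchange,
    H = -J * sum_n S_n . S_{n+1} (each bond once), spin 1/2.
    """
    n_sites = len(amps)
    return [
        coupling * amps[n]
        - 0.5 * coupling * (amps[(n - 1) % n_sites] + amps[(n + 1) % n_sites])
        for n in range(n_sites)
    ]

def spin_wave(n_sites, mode):
    """Amplitudes of a superposition of the form (17.24), q = 2*pi*mode/N, spacing 1."""
    q = 2.0 * math.pi * mode / n_sites
    amps = [cmath.exp(1j * q * n) / math.sqrt(n_sites) for n in range(n_sites)]
    return q, amps

def excitation_energy(n_sites, mode, coupling=1.0):
    """Return (q, omega, residual); a residual near zero confirms an eigenstate."""
    q, amps = spin_wave(n_sites, mode)
    image = apply_one_magnon_hamiltonian(amps, coupling)
    # The state is normalized, so this expectation value is the eigenvalue
    # whenever the state is an eigenvector.
    omega = sum(a.conjugate() * b for a, b in zip(amps, image)).real
    residual = max(abs(b - omega * a) for a, b in zip(amps, image))
    return q, omega, residual

if __name__ == "__main__":
    for mode in (1, 2, 4, 8):
        q, omega, residual = excitation_energy(256, mode)
        print(f"q={q:.4f} omega={omega:.6f} omega/q^2={omega / q**2:.4f} residual={residual:.1e}")
```

For small $q$ the ratio $\omega/q^2$ approaches $J/2$ in this model, which is the quadratic dispersion quoted above.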

The ferromagnet gives us one more useful insight. We have been assuming
that one particular ground state (e.g. the one with $S_z = N/2$) has been somehow
‘chosen’. But what does the choosing? The answer to this is clear enough
in the (perfectly realistic) case in which the Hamiltonian $\hat{H}_S$ is supplemented
by a term $-g\mu B \sum_i \hat{S}_{iz}$, representing the effect of an applied field $B$ directed
along the $z$-axis. This term will indeed ensure that the ground state is unique,
and has $S_z = N/2$. Consider now the two limits $B \to 0$ and $N \to \infty$, both
at finite temperature. When $B \to 0$ at finite $N$, the $N + 1$ different $\hat{S}_z$ eigenstates
become degenerate, and we have an ensemble in which each enters with
an equal weight; there is therefore no loss of symmetry, even as $N \to \infty$ (but
only after $B \to 0$). On the other hand, if $N \to \infty$ at finite $B \neq 0$, the single
state with $S_z = N/2$ will be selected out as the unique ground state and this
asymmetric situation will persist even in the limit $B \to 0$. In a (classical)
mean field theory approximation we suppose that an ‘internal field’ is ‘spontaneously
generated’, which is aligned with the external $B$ and survives even
as $B \to 0$, thus ‘spontaneously’ breaking the symmetry.
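The non-commuting limits can be illustrated with a deliberately crude sketch. We assume (this is our simplification, not a statement from the text) that only the $S = N/2$ ground multiplet is populated, with Boltzmann weights $\exp(b S_z)$ where $b$ stands for the dimensionless combination $g\mu B/kT$, taken non-negative. The function name is illustrative.

```python
import math

def magnetization_fraction(n_spins, b):
    """<S_z>/(N/2) in the S = N/2 multiplet with weights exp(b * S_z), b >= 0.

    b is a stand-in for (g mu B)/(k T); states outside the ground multiplet
    are ignored, which is an assumption of this sketch.
    """
    s_max = n_spins / 2.0
    numerator = 0.0
    denominator = 0.0
    for k in range(n_spins + 1):
        s_z = s_max - k              # S_z = N/2, N/2 - 1, ..., -N/2
        weight = math.exp(-b * k)    # weight relative to the S_z = N/2 state
        numerator += s_z * weight
        denominator += weight
    return numerator / (denominator * s_max)

if __name__ == "__main__":
    # B -> 0 first, at fixed N: all N + 1 states weighted equally.
    print(magnetization_fraction(10, 0.0))
    # N large at fixed small B: the S_z = N/2 end of the multiplet dominates.
    for n in (10, 1000, 100000):
        print(n, magnetization_fraction(n, 0.01))
```

Taking $b \to 0$ at fixed $N$ gives zero net alignment, whereas increasing $N$ at fixed $b > 0$ drives the fraction towards one, however small $b$ is.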

The ferromagnet therefore provides an easily pictured system exhibiting 
many of the features associated with spontaneous symmetry breaking; most 
importantly, it strongly suggests that what is really characteristic about the 
phenomenon is that it entails ‘spontaneous ordering’.³ Generally such ordering
occurs below some characteristic ‘critical temperature’, $T_C$. The field which
develops a non-zero equilibrium value below $T_C$ is called an ‘order parameter’.
This concept forms the basis of Landau’s theory of second-order phase
transitions (see for example chapter XIV of Landau and Lifshitz 1980).

³ It is worth pausing to reflect on the idea that ordering is associated with symmetry
breaking.

We now turn to an example much more closely analogous to the particle
physics applications: the superfluid.


17.3.2 The Bogoliubov superfluid
Consider the non-relativistic Hamiltonian (in the Schrödinger picture)

$$\hat{H} = \frac{1}{2m} \int d^3x\, \nabla\hat{\phi}^\dagger \cdot \nabla\hat{\phi} + \frac{1}{2} \int\!\!\int d^3x\, d^3y\; v(|\boldsymbol{x} - \boldsymbol{y}|)\, \hat{\phi}^\dagger(\boldsymbol{x})\hat{\phi}^\dagger(\boldsymbol{y})\hat{\phi}(\boldsymbol{y})\hat{\phi}(\boldsymbol{x}) \qquad (17.25)$$

where $\hat{\phi}^\dagger(\boldsymbol{x})$ creates a boson of mass $m$ at position $\boldsymbol{x}$. This $\hat{H}$ describes
identical bosons interacting via a potential $v$, which is assumed to be weak
(see, for example, Schiff 1968 section 55, or Parry 1973 chapter 1). We note
at once that $\hat{H}$ is invariant under the global U(1) symmetry

$$\hat{\phi}(\boldsymbol{x}) \to \hat{\phi}'(\boldsymbol{x}) = e^{-i\alpha}\hat{\phi}(\boldsymbol{x}), \qquad (17.26)$$

the generator being the conserved number operator

$$\hat{N} = \int \hat{\phi}^\dagger\hat{\phi}\; d^3x \qquad (17.27)$$

which obeys $[\hat{N}, \hat{H}] = 0$. Our ultimate concern will be with the way this
symmetry is ‘spontaneously broken’ in the superfluid ground state. Naturally, 
since this is an Abelian, rather than a non-Abelian, symmetry the physics will 
not involve any (hidden) multiplet structure. But the nature of the ‘symmetry 
breaking ground state’ in this U(1) case (and in the BCS model of section 17.7) 
will serve as a physical model for non-Abelian cases also. 

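We begin by re-writing $\hat{H}$ in terms of mode creation and annihilation
operators in the usual way. We expand $\hat{\phi}(\boldsymbol{x})$ as a superposition of solutions of
the $v = 0$ problem, which are plane waves quantized in a large cube of volume
$\Omega$:

$$\hat{\phi}(\boldsymbol{x}) = \frac{1}{\Omega^{1/2}} \sum_{\boldsymbol{k}} \hat{a}_{\boldsymbol{k}}\, e^{i\boldsymbol{k}\cdot\boldsymbol{x}} \qquad (17.28)$$

where $\hat{a}_{\boldsymbol{k}}|0\rangle = 0$, $\hat{a}^\dagger_{\boldsymbol{k}}|0\rangle$ is a one-particle state, and $[\hat{a}_{\boldsymbol{k}}, \hat{a}^\dagger_{\boldsymbol{k}'}] = \delta_{\boldsymbol{k},\boldsymbol{k}'}$, with
all other commutators vanishing. We impose periodic boundary conditions
at the cube faces, and the free particle energies are $\epsilon_k = \boldsymbol{k}^2/2m$. Inserting
(17.28) into (17.25) leads (problem 17.1) to

$$\hat{H} = \sum_{\boldsymbol{k}} \epsilon_k\, \hat{a}^\dagger_{\boldsymbol{k}}\hat{a}_{\boldsymbol{k}} + \frac{1}{2\Omega} \sum_{\Delta} \bar{v}(|\boldsymbol{k}_1 - \boldsymbol{k}_1'|)\, \hat{a}^\dagger_{\boldsymbol{k}_1}\hat{a}^\dagger_{\boldsymbol{k}_2}\hat{a}_{\boldsymbol{k}_2'}\hat{a}_{\boldsymbol{k}_1'}\, \Delta(\boldsymbol{k}_1 + \boldsymbol{k}_2 - \boldsymbol{k}_1' - \boldsymbol{k}_2') \qquad (17.29)$$

where the sum is over all momenta $\boldsymbol{k}_1, \boldsymbol{k}_2, \boldsymbol{k}_1', \boldsymbol{k}_2'$ subject to the conservation
law imposed by the $\Delta$ function:

$$\begin{aligned}
\Delta(\boldsymbol{k}) &= 1 \quad \text{if } \boldsymbol{k} = \boldsymbol{0} \qquad (17.30) \\
&= 0 \quad \text{if } \boldsymbol{k} \neq \boldsymbol{0}. \qquad (17.31)
\end{aligned}$$

The interaction term in (17.29) is easily visualized as in figure 17.1. A pair
of particles in states $\boldsymbol{k}_1', \boldsymbol{k}_2'$ is scattered (conserving momentum) to a pair in
states $\boldsymbol{k}_1, \boldsymbol{k}_2$ via the Fourier transform of $v$:

$$\bar{v}(|\boldsymbol{k}|) = \int v(r)\, e^{-i\boldsymbol{k}\cdot\boldsymbol{r}}\, d^3r. \qquad (17.32)$$

Now, below the superfluid transition temperature $T_S$, we know that in the
limit as $v \to 0$ the ground state has all the particles ‘condensed’ into the lowest
energy state, which has $\boldsymbol{k} = \boldsymbol{0}$. Thus the ground state will be proportional to

$$|N, 0\rangle = (\hat{a}^\dagger_0)^N |0\rangle. \qquad (17.33)$$

FIGURE 17.1
The interaction term in (17.29).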


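When a weak repulsive $v$ is included, it is reasonable to hope that most of
the particles remain in the condensate, only relatively few being excited to
states with $\boldsymbol{k} \neq \boldsymbol{0}$. Let $N_0$ be the number of particles with $\boldsymbol{k} = \boldsymbol{0}$, where by
assumption $N_0 \approx N$. We now consider the limit $N$ (and $N_0$) $\to \infty$ and $\Omega \to \infty$
such that the density $\rho = N/\Omega$ (and $\rho_0 = N_0/\Omega$) stays constant. Bogoliubov
(1947) argued that in this limit we may effectively replace both $\hat{a}_0$ and $\hat{a}^\dagger_0$ in
the second term in (17.29) by the number $N_0^{1/2}$. This amounts to saying that
in the commutator

$$\frac{\hat{a}_0}{\Omega^{1/2}} \frac{\hat{a}^\dagger_0}{\Omega^{1/2}} - \frac{\hat{a}^\dagger_0}{\Omega^{1/2}} \frac{\hat{a}_0}{\Omega^{1/2}} = \frac{1}{\Omega} \qquad (17.34)$$

the two terms on the left-hand side are each of order $N_0/\Omega$ and hence finite,
while their difference may be neglected as $\Omega \to \infty$. Replacing $\hat{a}_0$ and $\hat{a}^\dagger_0$ by
$N_0^{1/2}$ leads (problem 17.2) to the following approximate form for $\hat{H}$:

$$\hat{H} \approx \hat{H}_B = {\sum_{\boldsymbol{k}}}' \hat{a}^\dagger_{\boldsymbol{k}}\hat{a}_{\boldsymbol{k}} E_k + \frac{1}{2}\frac{N^2}{\Omega}\bar{v}(0) + \frac{1}{2}\frac{N}{\Omega} {\sum_{\boldsymbol{k}}}' \bar{v}(|\boldsymbol{k}|)\,[\hat{a}^\dagger_{\boldsymbol{k}}\hat{a}^\dagger_{-\boldsymbol{k}} + \hat{a}_{\boldsymbol{k}}\hat{a}_{-\boldsymbol{k}}], \qquad (17.35)$$

where

$$E_k = \epsilon_k + \frac{N}{\Omega}\bar{v}(|\boldsymbol{k}|), \qquad (17.36)$$

primed summations do not include $\boldsymbol{k} = \boldsymbol{0}$, and terms which tend to zero as
$\Omega \to \infty$ have been dropped (thus, $N_0$ has been replaced by $N$).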

The most immediately striking feature of (17.35), as compared with $\hat{H}$ of
(17.29), is that $\hat{H}_B$ does not conserve the U(1) (number) symmetry (17.26)
while $\hat{H}$ does: it is easy to see that for (17.26) to be a good symmetry, the
number of $\hat{a}$'s must equal the number of $\hat{a}^\dagger$'s in every term. Thus the ground
state of $\hat{H}_B$, $|\text{ground}\rangle_B$, cannot be expected to be an eigenstate of the number
operator. However, it is important to be clear that the number non-conserving
aspect of (17.35) is of a completely different kind, conceptually, from that
which would be associated with a (hypothetical) ‘explicit’ number violating
term in the original Hamiltonian, for example the addition of a term of the
form ‘$\hat{a}^\dagger\hat{a}\hat{a}$’. In arriving at (17.35), we effectively replaced (17.28) by

$$\hat{\phi}_B(\boldsymbol{x}) = \rho_0^{1/2} + \frac{1}{\Omega^{1/2}} \sum_{\boldsymbol{k} \neq \boldsymbol{0}} \hat{a}_{\boldsymbol{k}}\, e^{i\boldsymbol{k}\cdot\boldsymbol{x}} \qquad (17.37)$$

where $\rho_0 = N_0/\Omega$, $N_0 \approx N$, and $N_0/\Omega$ remains finite as $\Omega \to \infty$. The limit is
crucial here: it enables us to picture the condensate $N_0$ as providing an infinite
reservoir of particles, with which excitations away from the ground state can
exchange particle number. From this point of view, a number non-conserving
ground state may appear more reasonable. The ultimate test, of course, is
whether such a state is a good approximation to the true ground state, for a
large but finite system.

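What is $|\text{ground}\rangle_B$? Remarkably, $\hat{H}_B$ can be exactly diagonalized by means
of the Bogoliubov quasiparticle operators (for $\boldsymbol{k} \neq \boldsymbol{0}$)

$$\hat{\alpha}_{\boldsymbol{k}} = f_k\hat{a}_{\boldsymbol{k}} + g_k\hat{a}^\dagger_{-\boldsymbol{k}}, \qquad \hat{\alpha}^\dagger_{\boldsymbol{k}} = f_k\hat{a}^\dagger_{\boldsymbol{k}} + g_k\hat{a}_{-\boldsymbol{k}}, \qquad (17.38)$$

where $f_k$ and $g_k$ are real functions of $k = |\boldsymbol{k}|$. We must again at once draw
attention to the fact that this transformation does not respect the symmetry
(17.26) either, since $\hat{a}_{\boldsymbol{k}} \to e^{-i\alpha}\hat{a}_{\boldsymbol{k}}$ while $\hat{a}^\dagger_{-\boldsymbol{k}} \to e^{+i\alpha}\hat{a}^\dagger_{-\boldsymbol{k}}$. In fact, the operators
$\hat{\alpha}^\dagger_{\boldsymbol{k}}$ will turn out to be precisely creation operators for quasiparticles
which exchange particle number with the ground state.

The commutator of $\hat{\alpha}_{\boldsymbol{k}}$ and $\hat{\alpha}^\dagger_{\boldsymbol{k}}$ is easily evaluated:

$$[\hat{\alpha}_{\boldsymbol{k}}, \hat{\alpha}^\dagger_{\boldsymbol{k}}] = f_k^2 - g_k^2, \qquad (17.39)$$

while two $\hat{\alpha}$'s or two $\hat{\alpha}^\dagger$'s commute. We choose $f_k$ and $g_k$ such that $f_k^2 - g_k^2 = 1$,
so that the $\hat{a}$'s and the $\hat{\alpha}$'s have the same (bosonic) commutation relations,
and the transformation (17.38) is ‘canonical’. A convenient choice is
$f_k = \cosh\theta_k$, $g_k = \sinh\theta_k$. We now assert that $\hat{H}_B$ can be written in the form

$$\hat{H}_B = \sum_{\boldsymbol{k}} \omega_k\, \hat{\alpha}^\dagger_{\boldsymbol{k}}\hat{\alpha}_{\boldsymbol{k}} + \beta \qquad (17.40)$$

for certain $\omega_k$ and $\beta$. Equation (17.40) implies, of course, that the eigenvalues
of $\hat{H}_B$ are $\beta + \sum_{\boldsymbol{k}} (n + 1/2)\omega_k$, and that $\hat{\alpha}^\dagger_{\boldsymbol{k}}$ acts as the creation operator for
the quasiparticle of energy $\omega_k$, as just anticipated.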
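The commutators quoted around (17.39) follow directly from (17.38) and the basic relations $[\hat{a}_{\boldsymbol{k}}, \hat{a}^\dagger_{\boldsymbol{k}'}] = \delta_{\boldsymbol{k},\boldsymbol{k}'}$. As a short worked sketch (using only the definitions above, and the fact that $f_k$ and $g_k$ depend on $|\boldsymbol{k}|$ alone, so that $\hat{\alpha}_{-\boldsymbol{k}} = f_k\hat{a}_{-\boldsymbol{k}} + g_k\hat{a}^\dagger_{\boldsymbol{k}}$):

```latex
\begin{aligned}
[\hat\alpha_{\boldsymbol k},\hat\alpha^\dagger_{\boldsymbol k}]
 &= [f_k\hat a_{\boldsymbol k}+g_k\hat a^\dagger_{-\boldsymbol k},\;
     f_k\hat a^\dagger_{\boldsymbol k}+g_k\hat a_{-\boldsymbol k}] \\
 &= f_k^2\,[\hat a_{\boldsymbol k},\hat a^\dagger_{\boldsymbol k}]
  + g_k^2\,[\hat a^\dagger_{-\boldsymbol k},\hat a_{-\boldsymbol k}]
  + f_k g_k\left([\hat a_{\boldsymbol k},\hat a_{-\boldsymbol k}]
  + [\hat a^\dagger_{-\boldsymbol k},\hat a^\dagger_{\boldsymbol k}]\right) \\
 &= f_k^2 - g_k^2 , \\[4pt]
[\hat\alpha_{\boldsymbol k},\hat\alpha_{-\boldsymbol k}]
 &= f_k g_k\left([\hat a_{\boldsymbol k},\hat a^\dagger_{\boldsymbol k}]
  + [\hat a^\dagger_{-\boldsymbol k},\hat a_{-\boldsymbol k}]\right)
  = f_k g_k\,(1-1) = 0 .
\end{aligned}
```

The second line shows why the only potentially dangerous pair, $\hat{\alpha}_{\boldsymbol{k}}$ with $\hat{\alpha}_{-\boldsymbol{k}}$, still commutes: the two cross terms cancel.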

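We verify (17.40) slightly indirectly. We note first that it implies that

$$[\hat{H}_B, \hat{\alpha}^\dagger_{\boldsymbol{l}}] = \omega_l\, \hat{\alpha}^\dagger_{\boldsymbol{l}}. \qquad (17.41)$$

Substituting for $\hat{\alpha}^\dagger_{\boldsymbol{l}}$ from (17.38), we require

$$[\hat{H}_B, \cosh\theta_l\, \hat{a}^\dagger_{\boldsymbol{l}} + \sinh\theta_l\, \hat{a}_{-\boldsymbol{l}}] = \omega_l\,(\cosh\theta_l\, \hat{a}^\dagger_{\boldsymbol{l}} + \sinh\theta_l\, \hat{a}_{-\boldsymbol{l}}), \qquad (17.42)$$

which must hold as an identity in the $\hat{a}$'s and $\hat{a}^\dagger$'s. Using the expression (17.35)
for $\hat{H}_B$, and some patient work with the commutation relations (problem 17.3),
one finds

$$(\omega_l - E_l)\cosh\theta_l + \frac{N}{\Omega}\bar{v}(|\boldsymbol{l}|)\sinh\theta_l = 0 \qquad (17.43)$$

$$\frac{N}{\Omega}\bar{v}(|\boldsymbol{l}|)\cosh\theta_l - (\omega_l + E_l)\sinh\theta_l = 0. \qquad (17.44)$$

For consistency, therefore, we require

$$E_l^2 - \omega_l^2 - \left(\frac{N}{\Omega}\right)^2 (\bar{v}(|\boldsymbol{l}|))^2 = 0, \qquad (17.45)$$

or (recalling the definitions of $E_l$ and $\epsilon_l$)

$$\omega_l = \left[ \frac{\boldsymbol{l}^2}{2m} \left( \frac{\boldsymbol{l}^2}{2m} + 2\rho\,\bar{v}(|\boldsymbol{l}|) \right) \right]^{1/2} \qquad (17.46)$$

where $\rho = N/\Omega$. The value of $\tanh\theta_l$ is then determined via either of (17.43),
(17.44).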
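That (17.43) and (17.44) give the same $\tanh\theta_l$ once (17.45) holds is easy to confirm numerically. The following is a minimal sketch with illustrative parameter values; the function name `bogoliubov_mode` and the choice of units are our own assumptions.

```python
import math

def bogoliubov_mode(l, m, rho, vbar):
    """Return (omega, tanh from (17.43), tanh from (17.44)) for one mode.

    l    : magnitude of the wavevector
    m    : boson mass
    rho  : density N / Omega
    vbar : value of the Fourier-transformed potential at |l| (assumed > 0)
    """
    eps = l * l / (2.0 * m)                          # free-particle energy
    big_e = eps + rho * vbar                         # E_l of (17.36)
    omega = math.sqrt(big_e**2 - (rho * vbar)**2)    # consistency condition (17.45)
    tanh_a = (big_e - omega) / (rho * vbar)          # solved from (17.43)
    tanh_b = (rho * vbar) / (big_e + omega)          # solved from (17.44)
    return omega, tanh_a, tanh_b

if __name__ == "__main__":
    omega, tanh_a, tanh_b = bogoliubov_mode(0.7, 1.0, 2.0, 0.5)
    print(omega, tanh_a, tanh_b)
```

The two values of $\tanh\theta_l$ agree to rounding error and lie between 0 and 1 for a repulsive $\bar{v}$, so a real $\theta_l$ exists, and the returned $\omega_l$ coincides with the closed form (17.46).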

Equation (17.46) is an important result, giving the frequency as a function
of the momentum (or wavenumber); it is an example of a ‘dispersion relation’.
At the risk of stating the obvious, let us emphasize that equation (17.40) tells
us that the original system of interacting bosons is equivalent (under the
approximations made) to a system of non-interacting quasiparticles, whose
frequency $\omega_l$ is related to wavenumber by (17.46). These are the true modes
of the system. Let us consider this dispersion relation.

First of all, in the non-interacting case $\bar{v} = 0$, we recover the usual
frequency-wavenumber relation for a massive non-relativistic particle, $\omega_l = \boldsymbol{l}^2/2m$.
But if $\bar{v}(0) \neq 0$, the behaviour at small $|\boldsymbol{l}|$ is very different: $\omega_l \approx c_s|\boldsymbol{l}|$,
where $c_s = (\rho\bar{v}(0)/m)^{1/2}$. This dispersion relation is characteristic of a massless
mode, but in this case it is sound rather than light, with speed of sound
$c_s$. The spectrum is therefore phonon-like, not (non-relativistic) particle-like.
The two behaviours can be easily distinguished experimentally, by measuring
the low-temperature specific heat: in three dimensions, for $\omega_l \sim \boldsymbol{l}^2$ it goes
to zero as $T^{3/2}$, whereas for $\omega_l \sim |\boldsymbol{l}|$ it goes as $T^3$. The latter behaviour
is observed in superfluids. At large values of $|\boldsymbol{l}|$, however, $\omega_l$ behaves essentially
like $\boldsymbol{l}^2/2m$ and the spectrum returns to the ‘particle-like’ one of massive
bosons. Thus (17.46) interpolates between phonon-like behaviour at small $|\boldsymbol{l}|$
and particle-like behaviour at large $|\boldsymbol{l}|$.

There is still more to be learned from (17.46). If, in fact, $\bar{v}(|\boldsymbol{l}|) \sim 1/\boldsymbol{l}^2$,
then $\omega_l \to$ constant as $|\boldsymbol{l}| \to 0$, and the spectrum would not be phonon-like.
Indeed, if $\bar{v}(|\boldsymbol{l}|) \sim e^2/\boldsymbol{l}^2$, then $\omega_l \sim |e|(\rho/m)^{1/2}$ for small $|\boldsymbol{l}|$, which is just the
‘plasma frequency’ $\omega_p$. In particle physics terms, this would be analogous to
a dispersion relation of the form $\omega_l \sim (\omega_p^2 + \boldsymbol{l}^2)^{1/2}$, which describes a particle
with mass $\omega_p$. Such a $\bar{v}$ is, of course, Coulombic (the Fourier transform of
$e^2/|\boldsymbol{x}|$), indicating that in the case of such a long-range force the frequency
spectrum acquires a mass-gap. This will be the topic of chapter 19.
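The three regimes of (17.46) discussed above can be explored with a few lines of code. This is a sketch under stated assumptions: `contact` models a short-range force by a constant $\bar{v}$, `coulomb` takes $\bar{v}(|\boldsymbol{l}|) = e^2/\boldsymbol{l}^2$ exactly as written in the text, and the parameter values and function names are purely illustrative.

```python
import math

def omega(l, m, rho, vbar_of_l):
    """Quasiparticle frequency (17.46) for wavevector magnitude l > 0."""
    eps = l * l / (2.0 * m)
    return math.sqrt(eps * (eps + 2.0 * rho * vbar_of_l(l)))

def contact(v0):
    """Short-range model: vbar independent of l (an assumption of this sketch)."""
    return lambda l: v0

def coulomb(e_squared):
    """Long-range model: vbar(l) = e^2 / l^2, as in the text."""
    return lambda l: e_squared / (l * l)

if __name__ == "__main__":
    m, rho, v0, e_squared = 1.0, 1.0, 1.0, 4.0
    c_s = math.sqrt(rho * v0 / m)
    omega_p = math.sqrt(rho * e_squared / m)
    for l in (1e-3, 1e-1, 1.0, 10.0, 100.0):
        print(l,
              omega(l, m, rho, contact(v0)) / (c_s * l),          # -> 1 at small l
              omega(l, m, rho, contact(v0)) / (l * l / (2 * m)),  # -> 1 at large l
              omega(l, m, rho, coulomb(e_squared)) / omega_p)     # -> 1 at small l
```

In the Coulomb case the combination $2\rho\bar{v}\,\boldsymbol{l}^2/2m$ is independent of $\boldsymbol{l}$, so (17.46) gives $\omega_l^2 = \omega_p^2 + (\boldsymbol{l}^2/2m)^2$: the gap is present for every $\boldsymbol{l}$ in this non-relativistic model.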

Having discussed the spectrum of quasiparticle excitations, let us now
concentrate on the ground state in this model. From (17.40), it is clear that
it is defined as the state $|\text{ground}\rangle_B$ such that

$$\hat{\alpha}_{\boldsymbol{k}}|\text{ground}\rangle_B = 0 \quad \text{for all } \boldsymbol{k} \neq \boldsymbol{0}; \qquad (17.47)$$

i.e. as the state with no non-zero-momentum quasiparticles in it. This is a
complicated state in terms of the original $\hat{a}_{\boldsymbol{k}}$ and $\hat{a}^\dagger_{\boldsymbol{k}}$ operators, but we can
give a formal expression for it, as follows. Since the $\hat{a}$'s and $\hat{\alpha}$'s are related by
a canonical transformation, there must exist a unitary operator $\hat{U}_B$ such that

$$\hat{\alpha}_{\boldsymbol{k}} = \hat{U}_B\,\hat{a}_{\boldsymbol{k}}\,\hat{U}_B^{-1}, \qquad \hat{a}_{\boldsymbol{k}} = \hat{U}_B^{-1}\,\hat{\alpha}_{\boldsymbol{k}}\,\hat{U}_B. \qquad (17.48)$$

Now we know that $\hat{a}_{\boldsymbol{k}}|0\rangle = 0$. Hence it follows that

$$\hat{\alpha}_{\boldsymbol{k}}\,\hat{U}_B|0\rangle = 0, \qquad (17.49)$$

and we can identify $|\text{ground}\rangle_B$ with $\hat{U}_B|0\rangle$. In problem 17.4, $\hat{U}_B$ is evaluated for
an $\hat{H}_B$ consisting of a single $k$-mode only, in which case the operator effecting
the transformation analogous to (17.48) is $\hat{U}_1 = \exp[\theta(\hat{a}\hat{a} - \hat{a}^\dagger\hat{a}^\dagger)/2]$ where
$\theta$ replaces $\theta_k$ in this case. This generalizes (in the form of products of such
operators) to the full $\hat{H}_B$ case, but we shall not need the detailed result; an
analogous result for the BCS ground state is discussed more fully in section
17.7. The important point is the following. It is clear from expanding the
exponentials that $\hat{U}_B$ creates a state in which the number of $a$-quanta (i.e. the
original bosons) is not fixed. Thus unlike the simple non-interacting ground
state $|N, 0\rangle$ of (17.33), $|\text{ground}\rangle_B = \hat{U}_B|0\rangle$ does not have a fixed number of
particles in it: that is to say, it is not an eigenstate of the symmetry operator
$\hat{N}$, as anticipated in the comment following (17.36). This is just the situation
alluded to in the paragraph before equation (17.19), in our discussion of the
ferromagnet.
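The absence of a definite particle number can be made concrete in the single-mode simplification mentioned above, where the quasiparticle operator is $\hat{\alpha} = \cosh\theta\,\hat{a} + \sinh\theta\,\hat{a}^\dagger$. The sketch below does not exponentiate $\hat{U}_1$; instead it solves the defining condition $\hat{\alpha}|\text{ground}\rangle = 0$ directly in the number basis, writing $|\text{ground}\rangle = \sum_n c_n|n\rangle$. The truncation `n_max` and the function names are our own choices.

```python
import math

def ground_state_amplitudes(theta, n_max=400):
    """Number-basis amplitudes c_n of the state annihilated by
    cosh(theta) * a + sinh(theta) * a^dagger (single mode, truncated at n_max)."""
    t = math.tanh(theta)
    c = [0.0] * (n_max + 1)
    c[0] = 1.0
    for n in range(2, n_max + 1, 2):
        # Coefficient of |n-1> in alpha|ground> = 0:
        #   cosh(theta) * sqrt(n) * c_n + sinh(theta) * sqrt(n-1) * c_{n-2} = 0.
        # Odd amplitudes vanish because the n = 1 condition forces c_1 = 0.
        c[n] = -t * math.sqrt((n - 1) / n) * c[n - 2]
    norm = math.sqrt(sum(x * x for x in c))
    return [x / norm for x in c]

def number_mean_and_variance(c):
    """Mean and variance of the number operator in the state with amplitudes c."""
    mean = sum(n * x * x for n, x in enumerate(c))
    mean_sq = sum(n * n * x * x for n, x in enumerate(c))
    return mean, mean_sq - mean * mean

if __name__ == "__main__":
    amplitudes = ground_state_amplitudes(0.5)
    print(number_mean_and_variance(amplitudes))
```

The state is a superposition of even particle numbers with a non-zero spread in $N$, so it cannot be an eigenstate of the number operator. From the inverse transformation one expects the mean number to equal $\sinh^2\theta$ in this single-mode model.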

Consider now the expectation value of $\hat{\phi}(\boldsymbol{x})$ in any state of definite particle
number, that is, in an eigenstate of the symmetry operator $\hat{N}$. It is easy to
see that this must vanish (remember that $\hat{\phi}$ destroys a boson, and so $\hat{\phi}|N\rangle$ is
proportional to $|N - 1\rangle$, which is orthogonal to $|N\rangle$). On the other hand, this
is not true of $\hat{\phi}_B(\boldsymbol{x})$: for example, in the non-interacting ground state (17.33),
we have

$$\langle N, 0|\hat{\phi}_B(\boldsymbol{x})|N, 0\rangle = \rho_0^{1/2}. \qquad (17.50)$$

Furthermore, using the inverse of (17.38)

$$\hat{a}_{\boldsymbol{k}} = \cosh\theta_k\, \hat{\alpha}_{\boldsymbol{k}} - \sinh\theta_k\, \hat{\alpha}^\dagger_{-\boldsymbol{k}}, \qquad (17.51)$$

together with (17.47), we find the similar result:

$${}_B\langle\text{ground}|\hat{\phi}_B(\boldsymbol{x})|\text{ground}\rangle_B = \rho_0^{1/2}. \qquad (17.52)$$

The question is now how to generalize (17.50) or (17.52) to the complete $\hat{\phi}(\boldsymbol{x})$
and the true ground state $|\text{ground}\rangle$, in the limit $N, \Omega \to \infty$ with fixed $N/\Omega$.
We make the assumption that

$$\langle\text{ground}|\hat{\phi}(\boldsymbol{x})|\text{ground}\rangle \neq 0; \qquad (17.53)$$

that is, we abstract from the Bogoliubov model the crucial feature that the
field acquires a non-zero expectation value in the ground state, in the infinite
volume limit.

We are now at the heart of spontaneous symmetry breaking in field theory. 
Condition (17.53) has the form of an ‘ordering’ condition: it is analogous to 
the non-zero value of the total spin in the ferromagnetic case, but in (17.53),
we must again emphasize, $|\text{ground}\rangle$ is not an eigenstate of the symmetry
operator $\hat{N}$; if it were, (17.53) would vanish, as we have just seen. Recalling
the association ‘quantum vacuum $\leftrightarrow$ many body ground state’ we expect
that the occurrence of a non-zero vacuum expectation value (vev) for an op- 
erator transforming non-trivially under a symmetry operator will be the key 
requirement for spontaneous symmetry breaking in field theory. Such opera- 
tors are generically called order parameters. In the next section we show how 
this requirement necessitates one (or more) massless modes, via Goldstone’s 
theorem (1961). 

Before leaving the superfluid, we examine (17.37) and (17.52) in another
way, which is only rigorous for a finite system but is nevertheless very suggestive.
Since the original $\hat{H}$ has a U(1) symmetry under which $\hat{\phi}$ transforms to
$\hat{\phi}' = \exp(-i\alpha)\hat{\phi}$, we should be at liberty to replace (17.37) by

$$\hat{\phi}'_B = e^{-i\alpha}\rho_0^{1/2} + \frac{1}{\Omega^{1/2}} \sum_{\boldsymbol{k} \neq \boldsymbol{0}} \hat{a}_{\boldsymbol{k}}\, e^{-i\alpha}\, e^{i\boldsymbol{k}\cdot\boldsymbol{x}}. \qquad (17.54)$$

But in that case our condition (17.52) becomes

$${}_B\langle\text{ground}|\hat{\phi}'_B|\text{ground}\rangle_B = e^{-i\alpha}\; {}_B\langle\text{ground}|\hat{\phi}_B|\text{ground}\rangle_B. \qquad (17.55)$$

Now $\hat{\phi}' = \hat{U}_\alpha\hat{\phi}\,\hat{U}_\alpha^{-1}$ where $\hat{U}_\alpha = \exp(i\alpha\hat{N})$. Hence (17.55) may be written as

$${}_B\langle\text{ground}|\hat{U}_\alpha\hat{\phi}_B\hat{U}_\alpha^{-1}|\text{ground}\rangle_B = e^{-i\alpha}\; {}_B\langle\text{ground}|\hat{\phi}_B|\text{ground}\rangle_B. \qquad (17.56)$$

If $|\text{ground}\rangle_B$ were an eigenstate of $\hat{N}$ with eigenvalue $N$, say, then the $\hat{U}_\alpha$ factors
in (17.56) would become just $e^{i\alpha N} \cdot e^{-i\alpha N}$ and would cancel out, leaving a
contradiction. Instead, however, knowing that $|\text{ground}\rangle_B$ is not an eigenstate
of $\hat{N}$, we can regard $\hat{U}_\alpha^{-1}|\text{ground}\rangle_B$ as an ‘alternative ground state’
$|\text{ground}, \alpha\rangle_B$ such that

$${}_B\langle\text{ground}, \alpha|\hat{\phi}_B|\text{ground}, \alpha\rangle_B = e^{-i\alpha}\; {}_B\langle\text{ground}|\hat{\phi}_B|\text{ground}\rangle_B, \qquad (17.57)$$

the original choice (17.52) corresponding to $\alpha = 0$. There are infinitely many
such ground states since $\alpha$ is a continuous parameter. No physical consequence
follows from choosing one rather than another, but we do have to choose one,
thus ‘spontaneously’ breaking the symmetry. In choosing say $\alpha = 0$, we are
deciding (arbitrarily) to pick the ground state such that ${}_B\langle\text{ground}|\hat{\phi}_B|\text{ground}\rangle_B$
is aligned in the ‘real’ direction. By hypothesis, a similar situation obtains
for the true ground state. None of the states $|\text{ground}, \alpha\rangle$ is an eigenstate for
$\hat{N}$: instead, they are certain coherent superpositions of states with different
eigenvalues $N$, such that the expectation value of $\hat{\phi}$ has a definite phase.
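The phase relation (17.57) can be illustrated in a toy setting. As a stand-in for the condensate we take a single-mode coherent state with amplitude $z$ (so that $|z|^2$ plays the role of $N_0$); this choice is our assumption and is not the Bogoliubov ground state itself, but it shares the key property of being a superposition of different particle numbers with a definite phase. All function names are illustrative.

```python
import cmath
import math

def coherent_amplitudes(z, n_max=80):
    """Number-basis amplitudes of a coherent state with <a> = z (truncated)."""
    c = [cmath.exp(-abs(z) ** 2 / 2)]
    for n in range(n_max):
        c.append(c[-1] * z / math.sqrt(n + 1))
    return c

def rotate_phase(c, alpha):
    """Apply exp(-i * alpha * N), the inverse of U_alpha = exp(i * alpha * N)."""
    return [x * cmath.exp(-1j * alpha * n) for n, x in enumerate(c)]

def expectation_a(c):
    """<a> = sum_n sqrt(n + 1) * conj(c_n) * c_{n+1}."""
    return sum(math.sqrt(n + 1) * c[n].conjugate() * c[n + 1]
               for n in range(len(c) - 1))

def overlap(c1, c2):
    """Scalar product <c1|c2>."""
    return sum(a.conjugate() * b for a, b in zip(c1, c2))

if __name__ == "__main__":
    alpha = 0.7
    for z in (1.0, 2.0, 4.0):
        state = coherent_amplitudes(z)
        rotated = rotate_phase(state, alpha)
        print(z, expectation_a(rotated) / expectation_a(state), abs(overlap(state, rotated)))
```

The ratio of expectation values is $e^{-i\alpha}$, as in (17.57), while the overlap between the original and rotated states shrinks rapidly as $|z|^2$ grows, in line with the earlier remark that alternative ground states become orthogonal in the limit of infinitely many degrees of freedom.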




17.4 Goldstone’s theorem 


We return to quantum field theory proper, and show following Goldstone 
(1961) (see also Goldstone, Salam and Weinberg 1962) how in case (b) of the 
Fabri-Picasso theorem massless particles will necessarily be present. Whether 
these particles will actually be observable depends, however, on whether the 
theory also contains gauge fields. In this chapter we are concerned solely with 
global symmetries, and gauge fields are absent; the local symmetry case is 
treated in chapter 19. 

Suppose, then, that we have a Lagrangian $\hat{\mathcal{L}}$ with a continuous symmetry
generated by a charge $\hat{Q}$, which is independent of time, and is the space
integral of the $\mu = 0$ component of a conserved Noether current:

$$\hat{Q} = \int \hat{j}_0(x)\, d^3x. \qquad (17.58)$$

We consider the case in which the vacuum of this theory is not invariant, i.e.
is not annihilated by $\hat{Q}$.

Suppose $\hat{\phi}(y)$ is some field operator which is not invariant under the continuous
symmetry in question, and consider the vacuum expectation value

$$\langle 0|[\hat{Q}, \hat{\phi}(y)]|0\rangle. \qquad (17.59)$$

Just as in equation (17.13), translation invariance implies that this vev is, in
fact, independent of $y$, and we may set $y = 0$. If $\hat{Q}$ were to annihilate $|0\rangle$, the
expression (17.59) would clearly vanish: we investigate the consequences of it
not vanishing. Since $\hat{\phi}$ is not invariant under $\hat{Q}$, the commutator in (17.59) will
give some other field, call it $\hat{\phi}'(y)$; thus the hallmark of the hidden symmetry
situation is the existence of some field (here $\hat{\phi}'(y)$) with non-vanishing vacuum
expectation value, just as in (17.53).
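From (17.58), we can write (17.59) as

$$\begin{aligned}
0 &\neq \langle 0|\hat{\phi}'(y)|0\rangle \qquad (17.60) \\
&= \langle 0|\left[\int d^3x\, \hat{j}_0(x),\, \hat{\phi}(y)\right]|0\rangle. \qquad (17.61)
\end{aligned}$$

Since, by assumption, $\partial_\mu \hat{j}^\mu = 0$, we have as usual

$$\frac{\partial}{\partial x^0} \int d^3x\, \hat{j}_0(x) + \int d^3x\, \nabla \cdot \hat{\boldsymbol{j}}(x) = 0, \qquad (17.62)$$

whence

$$\begin{aligned}
\frac{\partial}{\partial x^0} \int d^3x\, \langle 0|[\hat{j}_0(x), \hat{\phi}(y)]|0\rangle &= -\int d^3x\, \langle 0|[\nabla \cdot \hat{\boldsymbol{j}}(x), \hat{\phi}(y)]|0\rangle \qquad (17.63) \\
&= -\int d\boldsymbol{S} \cdot \langle 0|[\hat{\boldsymbol{j}}(x), \hat{\phi}(y)]|0\rangle. \qquad (17.64)
\end{aligned}$$

If the surface integral vanishes in (17.64), (17.61) will be independent of $x_0$.
The commutator in (17.64) involves local operators separated by a very large
space-like interval, and therefore the vanishing of (17.64) would seem to be
unproblematic. Indeed so it is, with the exception of the case in which the
symmetry is local and gauge fields are present. A detailed analysis of exactly
how this changes the argument being presented here will take us too far afield
at this point, and the reader is referred to Guralnik et al. (1968) and Bernstein
(1974). We shall treat the ‘spontaneously broken’ gauge theory case in chapter
19, but in less formal terms.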

Let us now see how the independence of (17.61) on $x_0$ leads to the necessity
for a massless particle in the spectrum. Inserting a complete set of states in
(17.61), we obtain

$$\begin{aligned}
0 &\neq \int d^3x \sum_n \left\{ \langle 0|\hat{j}_0(x)|n\rangle\langle n|\hat{\phi}(y)|0\rangle - \langle 0|\hat{\phi}(y)|n\rangle\langle n|\hat{j}_0(x)|0\rangle \right\} \qquad (17.65) \\
&= \int d^3x \sum_n \left\{ \langle 0|\hat{j}_0(0)|n\rangle\langle n|\hat{\phi}(y)|0\rangle\, e^{-ip_n\cdot x} - \langle 0|\hat{\phi}(y)|n\rangle\langle n|\hat{j}_0(0)|0\rangle\, e^{ip_n\cdot x} \right\} \qquad (17.66)
\end{aligned}$$

using translation invariance, with $p_n$ the 4-momentum eigenvalue of the state
$|n\rangle$. Performing the spatial integral on the right-hand side we find (omitting
the irrelevant $(2\pi)^3$)

$$0 \neq \sum_n \delta^3(\boldsymbol{p}_n) \left[ \langle 0|\hat{j}_0(0)|n\rangle\langle n|\hat{\phi}(y)|0\rangle\, e^{-ip_{n0}x_0} - \langle 0|\hat{\phi}(y)|n\rangle\langle n|\hat{j}_0(0)|0\rangle\, e^{ip_{n0}x_0} \right]. \qquad (17.67)$$

But this expression is independent of $x_0$. Massive states $|n\rangle$ will produce
explicit $x_0$-dependent factors $e^{\pm iM_n x_0}$ ($p_{n0} \to M_n$ as the $\delta$-function constrains
$\boldsymbol{p}_n = \boldsymbol{0}$), hence the matrix elements of $\hat{j}_0$ between $|0\rangle$ and such a massive state
must vanish, and such states contribute zero to (17.67). Equally, if we take
$|n\rangle = |0\rangle$, (17.67) vanishes identically. But it has been assumed to be not zero.
Hence some state or states must exist among $|n\rangle$ such that $\langle 0|\hat{j}_0|n\rangle \neq 0$ and
yet (17.67) is independent of $x_0$. The only possibility is states whose energy
$p_{n0}$ goes to zero as their 3-momentum does (from $\delta^3(\boldsymbol{p}_n)$). Such states are,
of course, massless; they are called generically Goldstone modes. Thus the
existence of a non-vanishing vacuum expectation value for a field, in a theory
with a continuous symmetry, appears to lead inevitably to the necessity of
having a massless particle, or particles, in the theory. This is the Goldstone
result.
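The step from ‘independent of $x_0$’ to ‘only zero-energy states contribute’ can be made slightly more explicit. As a heuristic sketch (it treats the sum over $|n\rangle$ as if the energies were discrete, which is our simplification), differentiate (17.67) with respect to $x_0$:

```latex
0 = \frac{\partial}{\partial x_0}\,(17.67)
  = -i \sum_n \delta^3(\boldsymbol{p}_n)\, p_{n0}
    \left[ \langle 0|\hat{j}_0(0)|n\rangle\langle n|\hat{\phi}(y)|0\rangle\, e^{-ip_{n0}x_0}
         + \langle 0|\hat{\phi}(y)|n\rangle\langle n|\hat{j}_0(0)|0\rangle\, e^{ip_{n0}x_0} \right].
```

Since this must hold for every $x_0$, and exponentials with different frequencies are linearly independent functions of $x_0$, each term with $p_{n0} \neq 0$ at $\boldsymbol{p}_n = \boldsymbol{0}$ must have a vanishing coefficient. The non-zero value of (17.67) can then only come from states with $p_{n0} = 0$ at $\boldsymbol{p}_n = \boldsymbol{0}$, other than the vacuum itself.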

The superfluid provided us with an explicit model exhibiting the crucial 
non-zero expectation value $\langle\text{ground}|\hat{\phi}|\text{ground}\rangle \neq 0$, in which the now expected
massless mode emerged dynamically. We now discuss a simpler, relativistic
model, in which the symmetry breaking is brought about more ‘by hand’,
that is, by choosing a parameter in the Lagrangian appropriately. Although
in a sense less ‘dynamical’ than the Bogoliubov superfluid (or the BCS su- 
perconductor, to be discussed shortly) this Goldstone model does provide a 
very simple example of the phenomenon of spontaneous symmetry breaking 
in field theory. 


17.5 Spontaneously broken global U(1) symmetry: the 
Goldstone model 


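We consider, following Goldstone (1961), a complex scalar field $\hat{\phi}$ as in section 7.1,
with

$$\hat{\phi} = \frac{1}{\sqrt{2}}(\hat{\phi}_1 - i\hat{\phi}_2), \qquad \hat{\phi}^\dagger = \frac{1}{\sqrt{2}}(\hat{\phi}_1 + i\hat{\phi}_2), \qquad (17.68)$$

described by the Lagrangian

$$\hat{\mathcal{L}}_G = (\partial_\mu\hat{\phi}^\dagger)(\partial^\mu\hat{\phi}) - \hat{V}(\hat{\phi}). \qquad (17.69)$$

We begin by considering the ‘normal’ case in which the potential has the form

$$\hat{V} = \hat{V}_S = \frac{1}{4}\lambda(\hat{\phi}^\dagger\hat{\phi})^2 + \mu^2\hat{\phi}^\dagger\hat{\phi} \qquad (17.70)$$

with $\mu^2, \lambda > 0$. The Hamiltonian density is then

$$\hat{\mathcal{H}}_G = \dot{\hat{\phi}}^\dagger\dot{\hat{\phi}} + \nabla\hat{\phi}^\dagger \cdot \nabla\hat{\phi} + \hat{V}(\hat{\phi}). \qquad (17.71)$$

Clearly $\hat{\mathcal{L}}_G$ is invariant under the global U(1) symmetry

$$\hat{\phi} \to \hat{\phi}' = e^{-i\alpha}\hat{\phi}, \qquad (17.72)$$

the generator being $\hat{N}_\phi$ of (7.23). We shall see how this symmetry may be
‘spontaneously broken’.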

We know that everything depends on the nature of the ground state of 
this field system — that is, the vacuum of the quantum field theory. In gen- 
eral, it is a difficult, non-perturbative, problem to find the ground state (or a 
good approximation to it — witness the superfluid). But we can make some 
progress by first considering the theory classically. It is clear that the absolute
minimum of the classical Hamiltonian $\mathcal{H}_{\rm G}$ is reached for

(i) $\phi$ = constant, which reduces the $\dot{\phi}$ and $\boldsymbol{\nabla}\phi$ terms to zero;

(ii) $\phi = \phi_0$ where $\phi_0$ is the minimum of the classical version of the
potential, $V$.

For $V = V_{\rm S}$ as in (17.70) but without the hats, and with $\lambda$ and $\mu^2$ both
positive, the minimum of $V_{\rm S}$ is clearly at $\phi = 0$, and is unique. In the quantum
theory, we expect to treat small oscillations of the field about this minimum as
approximately harmonic, leading to the usual quantized modes. To implement
this, we expand $\hat{\phi}$ about the classical minimum at $\phi = 0$, writing as usual

$$\hat{\phi} = \int \frac{d^3\boldsymbol{k}}{(2\pi)^3\sqrt{2\omega}}\left[\hat{a}(k)e^{-ik\cdot x} + \hat{b}^\dagger(k)e^{ik\cdot x}\right] \tag{17.73}$$

where the plane waves are solutions of the ‘free’ ($\lambda = 0$) problem. For $\lambda = 0$
the Lagrangian is simply

$$\hat{\mathcal{L}}_{\rm free} = \partial_\mu\hat{\phi}^\dagger\partial^\mu\hat{\phi} - \mu^2\hat{\phi}^\dagger\hat{\phi}, \tag{17.74}$$

which represents a complex scalar field, consisting of two degrees of freedom,
each with the same mass $\mu$ (see section 7.1). Thus in (17.73) $\omega = (\boldsymbol{k}^2 + \mu^2)^{1/2}$,
and the vacuum is defined by

$$\hat{a}(k)|0\rangle = \hat{b}(k)|0\rangle = 0, \tag{17.75}$$

and so clearly

$$\langle 0|\hat{\phi}|0\rangle = 0. \tag{17.76}$$


It seems reasonable to interpret quantum field average values as corresponding
to classical field values, and on this interpretation (17.76) is consistent with
the fact that the classical minimum energy configuration has $\phi = 0$.

Consider now the case in which the classical minimum is not at $\phi = 0$.
This can be achieved by altering the sign of $\mu^2$ in (17.70) ‘by hand’, so that
the classical potential is now the ‘symmetry breaking’ one

$$V = V_{\rm SB} \equiv \frac{1}{4}\lambda(\phi^\dagger\phi)^2 - \mu^2\phi^\dagger\phi. \tag{17.77}$$

FIGURE 17.2
The classical potential $V_{\rm SB}$ of (17.77).


This is sketched versus $\phi_1$ and $\phi_2$ in figure 17.2. This time, although the
origin $\phi_1 = \phi_2 = 0$ is a stationary point, it is an (unstable) maximum rather
than a minimum. The minimum of $V_{\rm SB}$ occurs when

$$(\phi^\dagger\phi) = \frac{2\mu^2}{\lambda}, \tag{17.78}$$

or alternatively when

$$\phi_1^2 + \phi_2^2 = \frac{4\mu^2}{\lambda} \equiv v^2 \tag{17.79}$$

where

$$v = \frac{2|\mu|}{\lambda^{1/2}}. \tag{17.80}$$

The condition (17.79) can also be written as

$$|\phi| = v/\sqrt{2}. \tag{17.81}$$
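The location of the minimum can be checked numerically. The following is a minimal sketch in plain Python, assuming arbitrary illustrative values for $\mu^2$ and $\lambda$ (they are not taken from the text) and a hypothetical helper `v_sb`. It scans the radius in the $(\phi_1, \phi_2)$ plane and compares the result with the radius $v$ of (17.80), then evaluates the potential at several angles on that circle.

```python
import math

def v_sb(phi1, phi2, mu2, lam):
    """Classical symmetry-breaking potential (17.77) in terms of the real fields."""
    phi_sq = 0.5 * (phi1**2 + phi2**2)       # phi^dagger phi
    return 0.25 * lam * phi_sq**2 - mu2 * phi_sq

# Illustrative parameter values (assumptions, not from the text).
mu2, lam = 1.3, 0.7
v = 2.0 * math.sqrt(mu2) / math.sqrt(lam)    # radius of the circle of minima, (17.80)

# Scan the radius rho = sqrt(phi1^2 + phi2^2) on a grid with spacing 1e-3.
radii = [i * 1e-3 for i in range(1, 10001)]
rho_min = min(radii, key=lambda r: v_sb(r, 0.0, mu2, lam))

# The potential depends only on the radius, so every angle gives the same value.
angles = [2.0 * math.pi * j / 12 for j in range(12)]
values_on_circle = [v_sb(v * math.cos(a), v * math.sin(a), mu2, lam) for a in angles]

print(rho_min, v)
print(values_on_circle[0], -mu2**2 / lam)
```

The grid scan is deliberately crude: it only locates the minimum to within the grid spacing, which is enough to see that it coincides with $v$ and that the value of the potential there is $-\mu^4/\lambda$ at every angle.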


To have a clearer picture, it is helpful to introduce the ‘polar’ variables $\rho(x)$
and $\theta(x)$ via

$$\phi(x) = (\rho(x)/\sqrt{2})\exp(i\theta(x)/v) \tag{17.82}$$

where for convenience the $v$ is inserted so that $\theta$ has the same dimension
(mass) as $\rho$ and $\phi$. The minimum condition (17.81) therefore represents the
circle $\rho = v$; any point on this circle, at any value of $\theta$, represents a possible
classical ground state, and it is clear that they are (infinitely) degenerate.
Before proceeding further, we briefly outline a condensed matter analogue
of (17.77) and (17.81) which may help in understanding the change in sign of
the parameter $\mu^2$. Consider the free energy $F$ of a ferromagnet as a function
of the magnetization $\boldsymbol{M}$ at temperature $T$, and make an expansion of the
form

$$F \approx F_0(T) + \mu^2(T)\boldsymbol{M}^2 + \frac{\lambda}{4}\boldsymbol{M}^4 + \cdots \tag{17.83}$$

valid for weak and slowly varying magnetization. If the parameter $\mu^2$ is positive,
it is clear that $F$ has a simple ‘bowl’ shape as a function of $|\boldsymbol{M}|$, with a
minimum at $|\boldsymbol{M}| = 0$. This is the case for $T$ greater than the ferromagnetic
transition temperature $T_{\rm C}$. However, if one assumes that $\mu^2(T)$ changes sign
at $T_{\rm C}$, becoming negative for $T < T_{\rm C}$, then $F$ will now resemble a vertical
section of figure 17.2, the minimum being at $|\boldsymbol{M}| \neq 0$. Any direction of $\boldsymbol{M}$
is possible (only $|\boldsymbol{M}|$ is specified); but the system must choose one particular
direction (e.g. via the influence of a very weak external field, as discussed in
section 17.3.1), and when it does so the rotational invariance exhibited by $F$
of (17.83) is lost. This symmetry has been broken ‘spontaneously’, though
this is still only a classical analogue. Nevertheless, the model is essentially
the Landau mean field theory of ferromagnetism, and suggests that we should
think of the ‘symmetric’ and ‘broken symmetry’ situations as different phases
of the same system. It may also be the case in particle physics, that parameters
such as $\mu^2$ change sign as a function of $T$, or some other variable, thereby
effectively precipitating a phase change.
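The change of phase in (17.83) can be illustrated with a few lines of Python. This is only a sketch: the linear form $\mu^2(T) = a(T - T_{\rm C})$ used in `mu2_of_T` is an assumption made here for illustration (the text only requires that $\mu^2$ change sign at $T_{\rm C}$), and the function names and numerical values are hypothetical.

```python
import math

def mu2_of_T(T, Tc, a=1.0):
    """Assumed temperature dependence of mu^2: positive above Tc, negative below."""
    return a * (T - Tc)

def magnetization_minimum(mu2, lam):
    """|M| minimizing F - F0 = mu2*M^2 + (lam/4)*M^4, for lam > 0.

    Setting dF/d|M| = 2*mu2*|M| + lam*|M|^3 = 0 gives |M| = 0, or
    |M|^2 = -2*mu2/lam when mu2 is negative.
    """
    if mu2 >= 0.0:
        return 0.0
    return math.sqrt(-2.0 * mu2 / lam)

Tc, lam = 1.0, 2.0
m_above = magnetization_minimum(mu2_of_T(1.5, Tc), lam)   # symmetric phase
m_below = magnetization_minimum(mu2_of_T(0.5, Tc), lam)   # broken-symmetry phase
print(m_above, m_below)
```

Only the magnitude $|\boldsymbol{M}|$ is fixed by the minimization; the direction is left undetermined, which is the degeneracy discussed above.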

If we maintain the idea that the vacuum expectation value of the quantum
field should equal the ground state value of the classical field, the vacuum in
this $\mu^2 < 0$ case must therefore be $|0\rangle_{\rm B}$ such that ${}_{\rm B}\langle 0|\hat{\phi}|0\rangle_{\rm B}$ does not vanish,
in contrast to (17.76). It is clear that this is exactly the situation met in the
superfluid (but ‘B’ here will stand for ‘broken symmetry’), and is moreover
the condition for the existence of massless (Goldstone) modes. Let us see how
they emerge in this model.

In quantum field theory, particles are thought of as excitations from a
ground state, which is the vacuum. Figure 17.2 strongly suggests that if we
want a sensible quantum interpretation of a theory with the potential (17.77),
we had better expand the fields about a point on the circle of minima, about
which stable oscillations are likely, rather than about the obviously unstable
point $\phi = 0$. Let us pick the point $\rho = v$, $\theta = 0$ in the classical case. We might
well guess that ‘radial’ oscillations in $\hat{\rho}$ would correspond to a conventional
massive field (having a parabolic restoring potential), while ‘angle’ oscillations
in $\hat{\theta}$, which pass through all the degenerate vacua, have no restoring force
and are massless. Accordingly, we set

$$\hat{\phi}(x) = \frac{1}{\sqrt{2}}(v + \hat{h}(x))\exp(-i\hat{\theta}(x)/v) \tag{17.84}$$

and find (problem 17.5) that $\hat{\mathcal{L}}_{\rm G}$ (with $\hat{V} = \hat{V}_{\rm SB}$ of (17.77) with hats on)
becomes

$$\begin{aligned}
\hat{\mathcal{L}}_{\rm G} ={}& \frac{1}{2}\partial_\mu\hat{h}\,\partial^\mu\hat{h} - \mu^2\hat{h}^2 + \frac{1}{2}\partial_\mu\hat{\theta}\,\partial^\mu\hat{\theta} + \mu^4/\lambda \\
& + \frac{\hat{h}}{v}\partial_\mu\hat{\theta}\,\partial^\mu\hat{\theta} + \frac{\hat{h}^2}{2v^2}\partial_\mu\hat{\theta}\,\partial^\mu\hat{\theta} - \frac{\lambda}{4}v\hat{h}^3 - \frac{\lambda}{16}\hat{h}^4.
\end{aligned} \tag{17.85}$$

Equation (17.85) is very important. First of all, the first line shows that the
particle spectrum in the ‘spontaneously broken’ case is dramatically different
from that in the normal case: instead of two degrees of freedom with the same
mass $\mu$, one (the $\hat{\theta}$-mode) is massless, and the other (the $\hat{h}$-mode) has a mass
of $\sqrt{2}\mu$. We expect the vacuum $|0\rangle_{\rm B}$ to be annihilated by the mode operators
$\hat{a}_h$ and $\hat{a}_\theta$ for these fields. This implies, however, that

$${}_{\rm B}\langle 0|\hat{\phi}|0\rangle_{\rm B} = v/\sqrt{2} \tag{17.86}$$

which is consistent with our interpretation of the vacuum expectation value
(vev) as the classical minimum, and with the occurrence of massless modes.
(The constant term in (17.85), which does not affect equations of motion,
merely reflects the fact that the minimum value of $V_{\rm SB}$ is $-\mu^4/\lambda$.) The
ansatz (17.84) and the non-zero vev (17.86) may be compared with (17.37)
and (17.52), respectively, in the superfluid case.
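The non-derivative part of this expansion is easy to check numerically at the classical level. The sketch below, in plain Python with illustrative parameter values chosen here (not taken from the text), compares the potential (17.77) evaluated at $\phi^\dagger\phi = (v + h)^2/2$ with the polynomial in $h$ obtained by expanding about the minimum: a constant $-\mu^4/\lambda$, a quadratic term $\mu^2 h^2$, and cubic and quartic self-interactions. The helper names are hypothetical.

```python
import math

# Illustrative parameter values (assumptions, not from the text).
mu2, lam = 0.9, 0.4
v = 2.0 * math.sqrt(mu2 / lam)        # (17.80)

def v_sb(phi_dag_phi):
    """Symmetry-breaking potential (17.77) as a function of phi^dagger phi."""
    return 0.25 * lam * phi_dag_phi**2 - mu2 * phi_dag_phi

def v_expanded(h):
    """Potential expanded about the minimum; no term linear in h appears."""
    return (-mu2**2 / lam            # value of V_SB at the minimum
            + mu2 * h**2             # mass term: (1/2) m_h^2 h^2 with m_h^2 = 2 mu^2
            + 0.25 * lam * v * h**3  # cubic self-interaction
            + lam * h**4 / 16.0)     # quartic self-interaction

samples = [-1.0, -0.3, 0.0, 0.2, 0.7, 1.5]
max_diff = max(abs(v_sb(0.5 * (v + h)**2) - v_expanded(h)) for h in samples)
m_h = math.sqrt(2.0 * mu2)            # mass read off from the quadratic term
print(max_diff, m_h)
```

Because the angular field drops out of $\phi^\dagger\phi$ entirely, the potential cannot supply a mass term for it, which is the classical origin of the massless mode.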

Secondly, the second line of equation (17.85) shows that only the derivative
of the $\hat{\theta}$ field appears in the interaction terms, whereas this is not true of the $\hat{h}$
field. Indeed, the Lagrangian for the $\hat{\theta}$-mode cannot have any dependence on
a constant value of $\hat{\theta}$, since this could be transformed away by a global U(1)
transformation (17.72), which is a symmetry of the theory, and under which
$\hat{\theta} \to \hat{\theta} + v\alpha$. This will be an important point to remember when we consider
effective Lagrangians for Goldstone modes in section 18.3.

Goldstone's model, then, contains much of the essence of spontaneous 
symmetry breaking in field theory: a non-zero vacuum value of a field which 
is not an invariant under the symmetry group, zero mass bosons, and massive 
excitations in a direction in field space which is ‘orthogonal’ to the degenerate 
ground states. However, it has to be noted that the triggering mechanism for 
the symmetry breaking ($\mu^2 \to -\mu^2$) has to be put in by hand, in contrast to
the — admittedly approximate, but more ‘dynamical’ — Bogoliubov approach. 
The Goldstone model, in short, is essentially phenomenological. 

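As in the case of the superfluid, we may perfectly well choose a vacuum
corresponding to a classical ground state with non-zero $\theta$, say $\theta = -v\alpha$. Then

$${}_{\rm B}\langle 0,\alpha|\hat{\phi}|0,\alpha\rangle_{\rm B} = e^{-i\alpha}\frac{v}{\sqrt{2}} \tag{17.87}$$

$$\phantom{{}_{\rm B}\langle 0,\alpha|\hat{\phi}|0,\alpha\rangle_{\rm B}} = e^{-i\alpha}\,{}_{\rm B}\langle 0|\hat{\phi}|0\rangle_{\rm B}, \tag{17.88}$$

as in (17.57). But we know (see (7.27) and (7.28)) that

$$e^{-i\alpha}\hat{\phi} = \hat{\phi}' = \hat{U}_\alpha\hat{\phi}\hat{U}_\alpha^{-1} \tag{17.89}$$

where

$$\hat{U}_\alpha = e^{i\alpha\hat{N}_\phi}. \tag{17.90}$$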



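So (17.88) becomes

$${}_{\rm B}\langle 0,\alpha|\hat{\phi}|0,\alpha\rangle_{\rm B} = {}_{\rm B}\langle 0|\hat{U}_\alpha\hat{\phi}\hat{U}_\alpha^{-1}|0\rangle_{\rm B} \tag{17.91}$$

and we may interpret $\hat{U}_\alpha^{-1}|0\rangle_{\rm B}$ as the ‘alternative vacuum’ $|0,\alpha\rangle_{\rm B}$ (this argument
is, as usual, not valid in the infinite volume limit where $\hat{N}_\phi$ fails to
exist).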

It is interesting to find out what happens to the symmetry current corresponding
to the invariance (17.72), in the ‘broken symmetry’ case. This
current is given in (7.23) which we write again here in slightly different notation:

$$\hat{j}^\mu_\phi = i(\hat{\phi}^\dagger\partial^\mu\hat{\phi} - (\partial^\mu\hat{\phi})^\dagger\hat{\phi}), \tag{17.92}$$

normal ordering being understood. Written in terms of the $\hat{h}$ and $\hat{\theta}$ of (17.84),
$\hat{j}^\mu_\phi$ becomes

$$\hat{j}^\mu_\phi = v\partial^\mu\hat{\theta} + 2\hat{h}\partial^\mu\hat{\theta} + \hat{h}^2\partial^\mu\hat{\theta}/v. \tag{17.93}$$

The term involving just the single field $\hat{\theta}$ is very remarkable: it tells us that
there is a non-zero matrix element of the form

$${}_{\rm B}\langle 0|\hat{j}^\mu_\phi(x)|\theta, p\rangle = -ip^\mu v e^{-ip\cdot x} \tag{17.94}$$

where $|\theta, p\rangle$ stands for the state with one $\theta$-quantum (Goldstone boson), with
momentum $p^\mu$. This is easily seen by writing the usual normal mode expansion
for $\hat{\theta}$, and using the standard bosonic commutation relations for $\hat{a}_\theta(k)$, $\hat{a}^\dagger_\theta(k')$.
In words, (17.94) asserts that, when the symmetry is spontaneously broken,
the symmetry current connects the vacuum to a state with one Goldstone
quantum, with an amplitude which is proportional to the symmetry breaking
vacuum expectation value $v$, and which vanishes as the 4-momentum goes to
zero. The matrix element (17.94), with $x = 0$, is precisely of the type that was
shown to be non-zero in the proof of the Goldstone theorem, after (17.67).
Note also that (17.94) is consistent with $\partial_\mu\hat{j}^\mu_\phi = 0$ only if $p^2 = 0$, as is required
for the massless $\hat{\theta}$.
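As a check on the form of the current, (17.93) can be worked out directly from (17.92) with the parametrization (17.84). The following is a sketch at the classical level, treating the fields as commuting functions and so ignoring operator-ordering questions:

```latex
\begin{align*}
\hat\phi &= \tfrac{1}{\sqrt2}\,(v+\hat h)\,e^{-i\hat\theta/v}, \\
\partial^\mu\hat\phi &= \tfrac{1}{\sqrt2}\left[\partial^\mu\hat h
   - \tfrac{i}{v}\,(v+\hat h)\,\partial^\mu\hat\theta\right]e^{-i\hat\theta/v}, \\
\hat\phi^\dagger\,\partial^\mu\hat\phi - (\partial^\mu\hat\phi)^\dagger\,\hat\phi
   &= -\tfrac{i}{v}\,(v+\hat h)^2\,\partial^\mu\hat\theta, \\
\hat j^\mu_\phi = i\left[\hat\phi^\dagger\,\partial^\mu\hat\phi
   - (\partial^\mu\hat\phi)^\dagger\,\hat\phi\right]
   &= \tfrac{1}{v}\,(v+\hat h)^2\,\partial^\mu\hat\theta
    = v\,\partial^\mu\hat\theta + 2\hat h\,\partial^\mu\hat\theta
      + \hat h^2\,\partial^\mu\hat\theta/v .
\end{align*}
```

The terms in $\partial^\mu\hat h$ cancel between the two pieces of the current, so that every term carries a derivative of $\hat\theta$; the term linear in the fields is the one responsible for the single-particle matrix element (17.94).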

We are now ready to generalize the Abelian U(1) model to the (global) 
non-Abelian case. 




17.6 Spontaneously broken global non-Abelian 
symmetry 
We can illustrate the essential features by considering a particular example,
which in fact forms part of the Higgs sector of the Standard Model. We
consider an SU(2) doublet, but this time not of fermions as in section 12.3,
but of bosons:

$$\hat{\phi} = \begin{pmatrix}\hat{\phi}^+ \\ \hat{\phi}^0\end{pmatrix} \equiv \begin{pmatrix}\frac{1}{\sqrt{2}}(\hat{\phi}_1 + i\hat{\phi}_2) \\ \frac{1}{\sqrt{2}}(\hat{\phi}_3 + i\hat{\phi}_4)\end{pmatrix} \tag{17.95}$$

where the complex scalar field $\hat{\phi}^+$ destroys positively charged particles and
creates negatively charged ones, and the complex scalar field $\hat{\phi}^0$ destroys neutral
particles and creates neutral antiparticles. As we shall see in a moment,
the Lagrangian we shall use has an additional U(1) symmetry, so that the
full symmetry is SU(2) × U(1). This U(1) symmetry leads to a conserved
quantum number which we call $y$. We associate the physical charge $Q$ with
the eigenvalue $t_3$ of the SU(2) generator $\hat{t}_3$, and with $y$, via

$$Q = e(t_3 + y/2) \tag{17.96}$$

so that $y(\phi^+) = 1 = y(\phi^0)$. Thus $\phi^+$ and $\phi^0$ can be thought of as analogous
to the hadronic iso-doublet $(K^+, K^0)$.
The Lagrangian we choose is a simple generalization of (17.69) and (17.77):

$$\hat{\mathcal{L}}_\Phi = (\partial_\mu\hat{\phi}^\dagger)(\partial^\mu\hat{\phi}) + \mu^2\hat{\phi}^\dagger\hat{\phi} - \frac{\lambda}{4}(\hat{\phi}^\dagger\hat{\phi})^2 \tag{17.97}$$

which has the ‘spontaneous symmetry breaking’ choice of sign for the parameter
$\mu^2$. Plainly, for the ‘normal’ sign of $\mu^2$, in which ‘$+\mu^2\hat{\phi}^\dagger\hat{\phi}$’ is replaced by
‘$-\mu^2\hat{\phi}^\dagger\hat{\phi}$’, with $\mu^2$ positive in both cases, the free ($\lambda = 0$) part would describe
a complex doublet, with four degrees of freedom, each with the same mass $\mu$.
Let us see what happens in the broken symmetry case.

For the Lagrangian (17.97) with $\mu^2 > 0$, the minimum of the classical
potential is at the point

$$(\phi^\dagger\phi)_{\rm min} = 2\mu^2/\lambda \equiv v^2/2. \tag{17.98}$$

As in the U(1) case, we interpret (17.98) as a condition on the vev of $\hat{\phi}^\dagger\hat{\phi}$,

$$\langle 0|\hat{\phi}^\dagger\hat{\phi}|0\rangle = v^2/2. \tag{17.99}$$

Before proceeding we note that (17.97) is invariant under global SU(2) transformations

$$\hat{\phi} \to \hat{\phi}' = \exp(-i\boldsymbol{\alpha}\cdot\boldsymbol{\tau}/2)\,\hat{\phi} \tag{17.100}$$

but also under a separate global U(1) transformation

$$\hat{\phi} \to \hat{\phi}' = \exp(-i\alpha)\,\hat{\phi} \tag{17.101}$$

where $\alpha$ is to be distinguished from $\boldsymbol{\alpha} = (\alpha_1, \alpha_2, \alpha_3)$. The symmetry is then
referred to as SU(2) × U(1), which is the symmetry of the electroweak sector
of the Standard Model, except that in that case it is a local symmetry.

As before, in order to get a sensible particle spectrum we must expand the
fields $\hat{\phi}$ not about $\hat{\phi} = 0$ but about a point satisfying the stable ground state
(vacuum) condition (17.98). That is, we need to define ‘$\langle 0|\hat{\phi}|0\rangle$’ and expand
about it, as in (17.84). In the present case, however, the situation is more
complicated than (17.84) since the complex doublet (17.95) contains four real
fields as indicated in (17.95), and (17.98) becomes

$$\langle 0|\hat{\phi}_1^2 + \hat{\phi}_2^2 + \hat{\phi}_3^2 + \hat{\phi}_4^2|0\rangle = v^2. \tag{17.102}$$

It is evident that we have a lot of freedom in choosing the $\langle 0|\hat{\phi}_i|0\rangle$ so that
(17.102) holds, and it is not at first obvious what an appropriate generalization
of (17.84) and (17.85) might be.

Furthermore, in this more complicated (non-Abelian) situation a qualitatively
new feature can arise: it may happen that the chosen condition
$\langle 0|\hat{\phi}_i|0\rangle \neq 0$ is invariant under some subset of the allowed symmetry transformations.
This would effectively mean that this particular choice of the
vacuum state respected that subset of symmetries, which would therefore not
be ‘spontaneously broken’ after all. Since each broken symmetry is associated
with a massless Goldstone boson, we would then get fewer of these bosons
than expected. Just this happens (by design) in the present case.

Suppose, then, that we could choose the $\langle 0|\hat{\phi}_i|0\rangle$ so as to break this SU(2)
× U(1) symmetry completely: we would then expect four massless fields.
Actually, however, it is not possible to make such a choice. An analogy may
make this point clearer. Suppose we were considering just SU(2), and the field
‘$\hat{\phi}$’ was an SU(2)-triplet, $\hat{\boldsymbol{\phi}}$. Then we could always write $\langle 0|\hat{\boldsymbol{\phi}}|0\rangle = v\boldsymbol{n}$ where
$\boldsymbol{n}$ is a unit vector; but this form is invariant under rotations about the $\boldsymbol{n}$-axis,
irrespective of where that points. In the present case, by using the freedom
of global SU(2) × U(1) phase changes, an arbitrary $\langle 0|\hat{\phi}|0\rangle$ can be brought to
the form

$$\langle 0|\hat{\phi}|0\rangle = \begin{pmatrix} 0 \\ v/\sqrt{2}\end{pmatrix}. \tag{17.103}$$

In considering what symmetries are respected or broken by (17.103), it is easiest
to look at infinitesimal transformations. It is then clear that the particular
transformation

$$\delta\hat{\phi} = -i\epsilon(1 + \tau_3)\hat{\phi} \tag{17.104}$$

(which is a combination of (17.101) and the ‘third component’ of (17.100)) is
still a symmetry of (17.103) since

$$(1 + \tau_3)\begin{pmatrix} 0 \\ v/\sqrt{2}\end{pmatrix} = \begin{pmatrix} 0 \\ 0\end{pmatrix} \tag{17.105}$$

so that

$$\langle 0|\hat{\phi}|0\rangle = \langle 0|\hat{\phi} + \delta\hat{\phi}|0\rangle; \tag{17.106}$$

we say that ‘the vacuum is invariant under (17.104)’, and when we look at
the spectrum of oscillations about that vacuum we expect to find only three
massless bosons, not four.

Oscillations about (17.103) are conveniently parametrized by

$$\hat{\phi} = \exp(-i\hat{\boldsymbol{\theta}}(x)\cdot\boldsymbol{\tau}/2v)\begin{pmatrix} 0 \\ \frac{1}{\sqrt{2}}(v + \hat{H}(x))\end{pmatrix}, \tag{17.107}$$

which is to be compared with (17.84). Inserting (17.107) into (17.97) (see
problem 17.6) we easily find that no mass term is generated for the $\hat{\boldsymbol{\theta}}$ fields,
while the $\hat{H}$ field piece is

$$\hat{\mathcal{L}}_H = \frac{1}{2}\partial_\mu\hat{H}\,\partial^\mu\hat{H} - \mu^2\hat{H}^2 + \text{interactions} \tag{17.108}$$

just as in (17.85), showing that $m_H = \sqrt{2}\mu$.

Let us now note carefully that whereas in the ‘normal symmetry’ case
with the opposite sign for the $\mu^2$ term in (17.97), the free-particle spectrum
consisted of a degenerate doublet of four degrees of freedom all with the same
mass $\mu$, in the ‘spontaneously broken’ case no such doublet structure is seen:
instead, there is one massive scalar field, and three massless scalar fields.
The number of degrees of freedom is the same in each case, but the physical
spectrum is completely different.
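The count of three massless fields can be traced back to the action of the symmetry generators on the vacuum value (17.103). The sketch below, in plain Python with 2×2 matrices written as nested lists, applies each of the four generators of SU(2) × U(1), and the combination $1 + \tau_3$ of (17.104), to the vev and records which of them annihilate it. The helper names and the choice $v = 1$ are illustrative assumptions.

```python
import math

# Pauli matrices and the 2x2 identity as nested lists.
ID   = [[1, 0], [0, 1]]
TAU1 = [[0, 1], [1, 0]]
TAU2 = [[0, -1j], [1j, 0]]
TAU3 = [[1, 0], [0, -1]]

def apply(m, vec):
    """Multiply a 2x2 matrix into a 2-component column vector."""
    return [m[0][0] * vec[0] + m[0][1] * vec[1],
            m[1][0] * vec[0] + m[1][1] * vec[1]]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def is_zero(vec, tol=1e-12):
    return all(abs(x) < tol for x in vec)

v = 1.0                                   # illustrative value
vev = [0.0, v / math.sqrt(2)]             # the form (17.103)

generators = {
    "tau1": TAU1,
    "tau2": TAU2,
    "tau3": TAU3,
    "1": ID,
    "1+tau3": add(ID, TAU3),              # the combination in (17.104)
}

# A generator leaves the vacuum invariant if it annihilates the vev.
unbroken = [name for name, g in generators.items() if is_zero(apply(g, vev))]

# Four independent generators, one unbroken combination among those tested.
n_goldstone = 4 - len(unbroken)
print(unbroken, n_goldstone)
```

This tests only the listed combinations rather than computing the full invariant subspace, but it is enough to show that none of the four basic generators leaves the vev invariant on its own, while the single combination $1 + \tau_3$ does.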

In the application of this to the electroweak sector of the Standard Model, 
the SU(2) x U(1) symmetry will be ‘gauged’ (i.e. made local), which is easily 
done by replacing the ordinary derivatives in (17.97) by suitable covariant 
ones. We shall see in chapter 19 that the result, with the choice (17.107), 
will be to end up with three massive gauge fields (those mediating the weak 
interactions) and one massless gauge field (the photon). We may summarize 
this (anticipated) result by saying, then, that when a spontaneously broken
non-Abelian symmetry is gauged, those gauge fields corresponding to symmetries
that are broken by the choice of $\langle 0|\hat{\phi}|0\rangle$ acquire a mass, while those that
correspond to symmetries that are respected by $\langle 0|\hat{\phi}|0\rangle$ do not. Exactly how
this happens will be the subject of chapter 19.

We end this chapter by considering a second important example of spon- 
taneous symmetry breaking in condensed matter physics, as a preliminary to 
our discussion of chiral symmetry breaking in the following chapter. 




17.7 The BCS superconducting ground state 


We shall not attempt to provide a self-contained treatment of the Bardeen- 
Cooper-Schrieffer (1957) — or BCS — theory; rather, we wish simply to focus 
on one aspect of the theory, namely the occurrence of an energy gap separating 
the ground state from the lowest excited levels of the fermionic energy spec- 
trum. The existence of such a gap is a fundamental ingredient of the theory 
of superconductivity; in the following chapter we shall see how Nambu (1960)
interpreted a chiral symmetry breaking fermionic mass term as an analogous
‘gap’. We emphasize at the outset that we shall here not treat electromagnetic 
interactions in the superconducting state, leaving that topic for chapter 19. 

Our discussion will deliberately have some similarity to that of section 
17.3.2. In the present case, of course, we shall be dealing with fermions — 
namely electrons — rather than the bosons of a superfluid. Nevertheless, we 
shall see that a similar kind of ‘condensation’ occurs in the superconductor 
too. Naturally, such a phenomenon can only occur for bosons. Thus an essen- 
tial element in the BCS theory is the identification of a mechanism whereby 
pairs of electrons become correlated, the behaviour of which may have some 
similarity to that of bosons. Now, direct Coulomb interaction between a pair 
of electrons is repulsive, and it remains so despite the screening that occurs 
in a solid. But the positively charged ions do provide sources of attraction 
for the electrons, and may be used as intermediaries (via ‘electron-phonon 
interactions’) to promote an effective attraction between electrons in certain 
circumstances. At this point we recall the characteristic feature of a weakly
interacting gas of electrons at zero temperature: thanks to the Exclusion Principle,
the electrons populate single particle energy levels up to some maximum
energy $E_{\rm F}$ (the Fermi energy), whose value is fixed by the electron density. It
turns out (see for example Kittel 1987, chapter 8) that electron-electron scattering,
mediated by phonon exchange, leads to an effective attraction between
two electrons whose energies $\epsilon_k$ lie in a thin band $E_{\rm F} - \omega_{\rm D} < \epsilon_k < E_{\rm F} + \omega_{\rm D}$
around $E_{\rm F}$, where $\omega_{\rm D}$ is the Debye frequency associated with lattice vibrations.
Cooper (1956) was the first to observe that the Fermi ‘sea’ was unstable
with respect to the formation of bound pairs, in the presence of an attractive
interaction. What this means is that the energy of the system can be lowered
by exciting a pair of electrons above $E_{\rm F}$, which then become bound to a state
with a total energy less than $2E_{\rm F}$. This instability modifies the Fermi sea in a
fundamental way: a sort of ‘condensate’ of pairs is created around the Fermi
energy, and we need a many-body formalism to handle the situation.

For simplicity we shall consider pairs of equal and opposite momentum
$\boldsymbol{k}$, so their total momentum is zero. It can also be argued that the effective
attraction will be greater when the spins are antiparallel, but the spin will
not be indicated explicitly in what follows: ‘$k$’ will stand for ‘$\boldsymbol{k}$ with spin up’,
and ‘$-k$’ for ‘$-\boldsymbol{k}$ with spin down’. With this by way of motivation, we thus
arrive at the BCS reduced Hamiltonian

$$\hat{H}_{\rm BCS} = \sum_k \epsilon_k\hat{c}_k^\dagger\hat{c}_k - V\sum_{k,k'}\hat{c}_{k'}^\dagger\hat{c}_{-k'}^\dagger\hat{c}_{-k}\hat{c}_k \tag{17.109}$$

which is the starting point of our discussion. In (17.109), the $\hat{c}$’s are fermionic
operators obeying the usual anticommutation relations, and the ground state
is such that $\hat{c}_k|0\rangle = 0$. The sum is over states lying near $E_{\rm F}$, as above, and
the single particle energies $\epsilon_k$ are measured relative to $E_{\rm F}$. The constant $V$
(with the minus sign in front) represents a simplified form of the effective
electron-electron attraction. Note that, in the non-interacting ($V = 0$) part,
$\hat{c}_k^\dagger\hat{c}_k$ is the number operator for the electrons, which because of the Pauli
Principle has eigenvalues 0 or 1; this term is of course completely analogous
to (7.55), and sums the single particle energies $\epsilon_k$ for each occupied level.
We immediately note that $\hat{H}_{\rm BCS}$ is invariant under the global U(1) transformation

$$\hat{c}_k \to \hat{c}_k' = e^{-i\alpha}\hat{c}_k \tag{17.110}$$

for all $k$, which is equivalent to $\hat{\psi}'(\boldsymbol{x}) = e^{-i\alpha}\hat{\psi}(\boldsymbol{x})$ for the electron field operator
at $\boldsymbol{x}$. Thus fermion number is conserved by $\hat{H}_{\rm BCS}$. However, just as for
the superfluid, we shall see that the BCS ground state does not respect the
symmetry.

We follow Bogoliubov (1958) and Bogoliubov et al. (1959) (see also Valatin
1958), and make a canonical transformation on the operators $\hat{c}_k$, $\hat{c}_{-k}^\dagger$ similar
to the one employed for the superfluid problem in (17.38), as motivated by
the ‘pair condensate’ picture. We set

$$\begin{aligned}
\hat{\beta}_k &= u_k\hat{c}_k - v_k\hat{c}_{-k}^\dagger, & \hat{\beta}_k^\dagger &= u_k\hat{c}_k^\dagger - v_k\hat{c}_{-k}, \\
\hat{\beta}_{-k} &= u_k\hat{c}_{-k} + v_k\hat{c}_k^\dagger, & \hat{\beta}_{-k}^\dagger &= u_k\hat{c}_{-k}^\dagger + v_k\hat{c}_k
\end{aligned} \tag{17.111}$$

where $u_k$ and $v_k$ are real, depend only on $k = |\boldsymbol{k}|$, and are chosen so as to
preserve anticommutation relations for the $\hat{\beta}$’s. This last condition implies
(problem 17.7)

$$u_k^2 + v_k^2 = 1 \tag{17.112}$$

so that we may conveniently set

$$u_k = \cos\theta_k, \qquad v_k = \sin\theta_k. \tag{17.113}$$

Just as in the superfluid case, the transformations (17.111) only make sense in
the context of a number non-conserving ground state, since they do not respect
the symmetry (17.110). Although $\hat{H}_{\rm BCS}$ of (17.109) is number conserving, we
shall shortly make a crucial number non-conserving approximation.
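The role of the condition (17.112) can be seen in an explicit matrix representation. The sketch below is an illustration only, not part of the original treatment: it represents the two fermionic modes $k$ and $-k$ on a four-dimensional space by a Jordan-Wigner construction (an assumption made here for convenience), builds the operators of (17.111), and checks their anticommutators in plain Python. All matrices are real, so the Hermitian conjugate is just the transpose; the function names are hypothetical.

```python
import math

def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    nb = len(b)
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(nb)]
            for i in range(len(a)) for k in range(nb)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def combine(x, a, y, b):
    """Return the matrix x*a + y*b."""
    n = len(a)
    return [[x * a[i][j] + y * b[i][j] for j in range(n)] for i in range(n)]

def anticommutator(a, b):
    return combine(1.0, matmul(a, b), 1.0, matmul(b, a))

def is_close(a, b, tol=1e-12):
    n = len(a)
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(n) for j in range(n))

I2 = [[1.0, 0.0], [0.0, 1.0]]
SZ = [[1.0, 0.0], [0.0, -1.0]]        # (-1)^n for a single mode
LOWER = [[0.0, 1.0], [0.0, 0.0]]      # maps the occupied state to the empty one

c_k = kron(LOWER, I2)                 # destroys the mode "k"
c_mk = kron(SZ, LOWER)                # destroys the mode "-k" (with the fermionic sign)
IDENT4 = kron(I2, I2)
ZERO4 = combine(0.0, IDENT4, 0.0, IDENT4)

def bogoliubov(u, v):
    """The operators beta_k and beta_{-k} of (17.111)."""
    beta_k = combine(u, c_k, -v, transpose(c_mk))
    beta_mk = combine(u, c_mk, v, transpose(c_k))
    return beta_k, beta_mk

def canonical(u, v):
    """True if the beta operators obey the standard anticommutation relations."""
    beta_k, beta_mk = bogoliubov(u, v)
    return (is_close(anticommutator(beta_k, transpose(beta_k)), IDENT4)
            and is_close(anticommutator(beta_mk, transpose(beta_mk)), IDENT4)
            and is_close(anticommutator(beta_k, beta_mk), ZERO4)
            and is_close(anticommutator(beta_k, transpose(beta_mk)), ZERO4))

theta = 0.37                           # illustrative angle
on_circle = canonical(math.cos(theta), math.sin(theta))   # u^2 + v^2 = 1
off_circle = canonical(0.9, 0.9)                          # u^2 + v^2 = 1.62
print(on_circle, off_circle)
```

The cross anticommutators vanish for any real $u$ and $v$; it is the normalization $\{\hat\beta_k, \hat\beta_k^\dagger\} = 1$ that requires $u_k^2 + v_k^2 = 1$.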

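We seek a diagonalization of (17.109), analogous to (17.40), in terms of
the mode operators $\hat{\beta}$ and $\hat{\beta}^\dagger$:

$$\hat{H}_{\rm BCS} = \sum_k \omega_k(\hat{\beta}_k^\dagger\hat{\beta}_k + \hat{\beta}_{-k}^\dagger\hat{\beta}_{-k}) + \gamma. \tag{17.114}$$

It is easy to check (problem 17.8) that the form (17.114) implies

$$[\hat{H}_{\rm BCS}, \hat{\beta}_k^\dagger] = \omega_k\hat{\beta}_k^\dagger \tag{17.115}$$

as in (17.41), despite the fact that the operators obey anticommutation relations.
Equation (17.115) then implies that the $\omega_k$ are the energies of states
created by the quasiparticle operators $\hat{\beta}_k^\dagger$ and $\hat{\beta}_{-k}^\dagger$, the ground state being
defined by

$$\hat{\beta}_k|{\rm ground}\rangle_{\rm BCS} = \hat{\beta}_{-k}|{\rm ground}\rangle_{\rm BCS} = 0. \tag{17.116}$$

Substituting for $\hat{\beta}_k^\dagger$ in (17.115) from (17.111) we therefore require

$$[\hat{H}_{\rm BCS}, \cos\theta_k\,\hat{c}_k^\dagger - \sin\theta_k\,\hat{c}_{-k}] = \omega_k(\cos\theta_k\,\hat{c}_k^\dagger - \sin\theta_k\,\hat{c}_{-k}) \tag{17.117}$$

which must hold as an identity in the $\hat{c}$’s and $\hat{c}^\dagger$’s. Evaluating (17.117) one
obtains (problem 17.9)

$$(\omega_k - \epsilon_k)\cos\theta_k - V\sin\theta_k\sum_{k'}\hat{c}_{-k'}\hat{c}_{k'} = 0 \tag{17.118}$$

$$-V\cos\theta_k\sum_{k'}\hat{c}_{k'}^\dagger\hat{c}_{-k'}^\dagger + (\omega_k + \epsilon_k)\sin\theta_k = 0. \tag{17.119}$$

It is at this point that we make the crucial ‘condensate’ assumption: we
replace the operator expressions $\sum_{k'}\hat{c}_{-k'}\hat{c}_{k'}$ and $\sum_{k'}\hat{c}_{k'}^\dagger\hat{c}_{-k'}^\dagger$ by their average
values, which are assumed to be non-zero in the ground state. Since these
operators carry fermion number $\pm 2$, it is clear that this assumption is only
valid if the ground state does not, in fact, have a definite number of particles,
just as in the superfluid case. We accordingly make the replacements

$$V\sum_{k'}\hat{c}_{-k'}\hat{c}_{k'} \to V\,{}_{\rm BCS}\langle{\rm ground}|\sum_{k'}\hat{c}_{-k'}\hat{c}_{k'}|{\rm ground}\rangle_{\rm BCS} \equiv \Delta \tag{17.120}$$

$$V\sum_{k'}\hat{c}_{k'}^\dagger\hat{c}_{-k'}^\dagger \to V\,{}_{\rm BCS}\langle{\rm ground}|\sum_{k'}\hat{c}_{k'}^\dagger\hat{c}_{-k'}^\dagger|{\rm ground}\rangle_{\rm BCS} \equiv \Delta^*. \tag{17.121}$$

In that case, equations (17.118) and (17.119) become

$$\omega_k\cos\theta_k = \epsilon_k\cos\theta_k + \Delta\sin\theta_k \tag{17.122}$$

$$\omega_k\sin\theta_k = -\epsilon_k\sin\theta_k + \Delta^*\cos\theta_k \tag{17.123}$$

which are consistent if

$$\omega_k = [\epsilon_k^2 + |\Delta|^2]^{1/2}. \tag{17.124}$$

Equation (17.124) is the fundamental result at this stage. Recalling that $\epsilon_k$
is measured relative to $E_{\rm F}$, we see that it implies that all excited states are
separated from $E_{\rm F}$ by a finite amount, namely $|\Delta|$.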
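The consistency condition can be made explicit by reading (17.122) and (17.123) as a homogeneous linear system for $\cos\theta_k$ and $\sin\theta_k$. The following short sketch shows the step; choosing the positive root is the assumption that $\omega_k$ is an excitation energy.

```latex
\begin{align*}
\begin{pmatrix} \omega_k - \epsilon_k & -\Delta \\ -\Delta^* & \omega_k + \epsilon_k \end{pmatrix}
\begin{pmatrix} \cos\theta_k \\ \sin\theta_k \end{pmatrix} &= 0
\quad\Longrightarrow\quad
(\omega_k - \epsilon_k)(\omega_k + \epsilon_k) - |\Delta|^2 = 0, \\
\omega_k^2 &= \epsilon_k^2 + |\Delta|^2 , \qquad
\omega_k \ge |\Delta| \ \text{ with equality at } \epsilon_k = 0 .
\end{align*}
```

A non-trivial solution for the angle exists only when the determinant vanishes, and the smallest quasiparticle energy is reached for a level exactly at the Fermi energy.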

In interpreting (17.124) we must however be careful to reckon energies for
an excited state as relative to a BCS state having the same number of pairs, if
we consider experimental probes which do not inject or remove electrons. Thus
relative to a component of $|{\rm ground}\rangle_{\rm BCS}$ with $N$ pairs, we may consider the
excitation of two particles above a BCS state with $N - 1$ pairs. The minimum
energy for this to be possible is $2|\Delta|$. It is this quantity which is usually called
the energy gap. Such an excited state is represented by $\hat{\beta}_k^\dagger\hat{\beta}_{-k}^\dagger|{\rm ground}\rangle_{\rm BCS}$.

We shall need the expressions for $\cos\theta_k$ and $\sin\theta_k$ which may be obtained
as follows. Squaring (17.122), using (17.124), and taking $\Delta$ now to be real and
equal to $|\Delta|$, we obtain

$$|\Delta|^2(\cos^2\theta_k - \sin^2\theta_k) = 2\epsilon_k|\Delta|\cos\theta_k\sin\theta_k \tag{17.125}$$

which leads to

$$\tan 2\theta_k = |\Delta|/\epsilon_k \tag{17.126}$$

and then

$$\cos\theta_k = \left[\frac{1}{2}\left(1 + \frac{\epsilon_k}{\omega_k}\right)\right]^{1/2}, \qquad \sin\theta_k = \left[\frac{1}{2}\left(1 - \frac{\epsilon_k}{\omega_k}\right)\right]^{1/2}. \tag{17.127}$$

All our experience to date indicates that the choice ‘$\Delta$ = real’ amounts to a
choice of phase for the ground state value:

$$V\,{}_{\rm BCS}\langle{\rm ground}|\sum_k\hat{c}_{-k}\hat{c}_k|{\rm ground}\rangle_{\rm BCS} = |\Delta|. \tag{17.128}$$

By making use of the U(1) symmetry (17.110), other phases for $\Delta$ are equally
possible.
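The quasiparticle energy and the mixing angle can be checked against the pair of conditions (17.122) and (17.123) directly. The sketch below, in plain Python, takes the gap parameter to be real and positive, uses an illustrative value for it (an assumption, not a value from the text), and evaluates the residuals of both equations for several single-particle energies on either side of the Fermi level. The function names are hypothetical.

```python
import math

def quasiparticle(eps, delta):
    """Return (omega, cos(theta), sin(theta)) from (17.124) and (17.127)."""
    omega = math.hypot(eps, delta)
    cos_t = math.sqrt(0.5 * (1.0 + eps / omega))
    sin_t = math.sqrt(0.5 * (1.0 - eps / omega))
    return omega, cos_t, sin_t

def residuals(eps, delta):
    """Left-hand side minus right-hand side of (17.122) and (17.123), real delta."""
    omega, c, s = quasiparticle(eps, delta)
    r1 = omega * c - (eps * c + delta * s)
    r2 = omega * s - (-eps * s + delta * c)
    return r1, r2

delta = 0.3                                   # illustrative gap parameter
energies = [-2.0, -0.5, 0.0, 0.5, 2.0]        # measured relative to the Fermi energy

max_residual = max(abs(r) for eps in energies for r in residuals(eps, delta))
min_omega = min(quasiparticle(eps, delta)[0] for eps in energies)
print(max_residual, min_omega)
```

Far below the Fermi level the angle tends to $\pi/2$ and far above it tends to zero, so that the mixing of particle and hole character is confined to a region of width of order $|\Delta|$ around the Fermi energy.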

The condition (17.128) has, of course, the by now anticipated form for a 
spontaneously broken U(1) symmetry, and we must therefore expect the oc- 
currence of a massless mode (which we do not demonstrate here). However, 
we may now recall that the electrons are charged, so that when electromag- 
netic interactions are included in the superconducting state, we have to allow 
the $\alpha$ in (17.110) to become a local function of $\boldsymbol{x}$. At the same time, the
massless photon field will enter. Remarkably, we shall learn in chapter 19 
that the expected massless (Goldstone) mode is, in this case, not observed: 
instead, that degree of freedom is incorporated into the gauge field, rendering 
it massive. As we shall see, this is the physics of the Meissner effect in a 
superconductor, and that of the “Higgs mechanism” in the Standard Model. 
Thus in the (charged) BCS model, both a fermion mass and a gauge boson 
mass are dynamically generated. 

An explicit formula for $\Delta$ can be found by using the definition (17.120),
together with the expression for $\hat{c}_k$ found by inverting (17.111):

$$\hat{c}_k = \cos\theta_k\,\hat{\beta}_k + \sin\theta_k\,\hat{\beta}_{-k}^\dagger. \tag{17.129}$$

This gives, using (17.120) and (17.129),

$$\begin{aligned}
|\Delta| &= V\,{}_{\rm BCS}\langle{\rm ground}|\sum_k(\cos\theta_k\,\hat{\beta}_{-k} - \sin\theta_k\,\hat{\beta}_k^\dagger)(\cos\theta_k\,\hat{\beta}_k + \sin\theta_k\,\hat{\beta}_{-k}^\dagger)|{\rm ground}\rangle_{\rm BCS} \\
&= V\,{}_{\rm BCS}\langle{\rm ground}|\sum_k\cos\theta_k\sin\theta_k\,\hat{\beta}_{-k}\hat{\beta}_{-k}^\dagger|{\rm ground}\rangle_{\rm BCS} \\
&= V\sum_k\frac{|\Delta|}{2[\epsilon_k^2 + |\Delta|^2]^{1/2}}.
\end{aligned} \tag{17.130}$$

The sum in (17.130) is only over the small band $E_{\rm F} - \omega_{\rm D} < \epsilon_k < E_{\rm F} + \omega_{\rm D}$
over which the effective electron-electron attraction operates. Replacing the
sum by an integral, we obtain the gap equation

$$1 = \frac{1}{2}VN_{\rm F}\int_{-\omega_{\rm D}}^{\omega_{\rm D}}\frac{d\epsilon}{[\epsilon^2 + |\Delta|^2]^{1/2}} = VN_{\rm F}\sinh^{-1}(\omega_{\rm D}/|\Delta|) \tag{17.131}$$

where $N_{\rm F}$ is the density of states at the Fermi level. Equation (17.131) yields

$$|\Delta| = \frac{\omega_{\rm D}}{\sinh(1/VN_{\rm F})} \approx 2\omega_{\rm D}e^{-1/VN_{\rm F}} \tag{17.132}$$

for $VN_{\rm F} \ll 1$. This is the celebrated BCS solution for the gap parameter
$|\Delta|$. Perhaps the most significant thing to note about it, for our purpose, is
that the expression for $|\Delta|$ is not an analytic function of the dimensionless
interaction parameter $VN_{\rm F}$ (it cannot be expanded as a power series in this
quantity), and so no perturbative treatment starting from a normal ground
state could reach this result. The estimate (17.132) is in reasonably good
agreement with experiment, and may be refined.
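The gap equation and its weak-coupling limit can be explored numerically. The following sketch in plain Python uses illustrative values for the Debye frequency and the dimensionless coupling (both are assumptions chosen here, not numbers from the text). It evaluates the closed-form solution, the weak-coupling approximation, and the integral form of the gap equation by a simple midpoint rule; the function names are hypothetical.

```python
import math

def gap_exact(omega_d, vn):
    """|Delta| solving 1 = vn * asinh(omega_d / |Delta|), with vn = V * N_F."""
    return omega_d / math.sinh(1.0 / vn)

def gap_weak_coupling(omega_d, vn):
    """Weak-coupling form of the gap, valid for vn << 1."""
    return 2.0 * omega_d * math.exp(-1.0 / vn)

def gap_integral(delta, omega_d, vn, steps=20000):
    """Right-hand side of the gap equation, integrated by the midpoint rule."""
    h = 2.0 * omega_d / steps
    total = 0.0
    for i in range(steps):
        eps = -omega_d + (i + 0.5) * h
        total += h / math.sqrt(eps * eps + delta * delta)
    return 0.5 * vn * total

# Illustrative values: energies in units of the Debye frequency.
omega_d, vn = 1.0, 0.3
delta = gap_exact(omega_d, vn)
check = gap_integral(delta, omega_d, vn)          # should be close to 1
ratio = gap_weak_coupling(omega_d, vn) / delta    # approaches 1 as vn -> 0
print(delta, check, ratio)

# Halving the coupling changes the gap by far more than a factor of two.
print(gap_exact(omega_d, 0.15) / delta)
```

The last line illustrates the non-analytic dependence on the coupling: because the gap behaves like $e^{-1/VN_{\rm F}}$, it falls faster than any power of $VN_{\rm F}$ as the coupling is reduced.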

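The explicit form of the ground state in this model can be found by a
method similar to the one indicated in section 17.3.2 for the superfluid. Since
the transformation from the $\hat{c}$’s to the $\hat{\beta}$’s is canonical, there must exist a
unitary operator which effects it via (compare (17.48))

$$\hat{U}_{\rm BCS}\,\hat{c}_k\,\hat{U}_{\rm BCS}^\dagger = \hat{\beta}_k, \qquad \hat{U}_{\rm BCS}\,\hat{c}_{-k}\,\hat{U}_{\rm BCS}^\dagger = \hat{\beta}_{-k}. \tag{17.133}$$

The operator $\hat{U}_{\rm BCS}$ is (Blatt 1964 section V.4, Yosida 1958, and compare
problem 17.4)

$$\hat{U}_{\rm BCS} = \prod_k\exp[\theta_k(\hat{c}_k^\dagger\hat{c}_{-k}^\dagger - \hat{c}_{-k}\hat{c}_k)]. \tag{17.134}$$

Then, since $\hat{c}_k|0\rangle = 0$, we have

$$\hat{U}_{\rm BCS}^\dagger\hat{\beta}_k\hat{U}_{\rm BCS}|0\rangle = 0 \tag{17.135}$$

showing that we may identify

$$|{\rm ground}\rangle_{\rm BCS} = \hat{U}_{\rm BCS}|0\rangle \tag{17.136}$$

via the condition (17.116). When the exponential in $\hat{U}_{\rm BCS}$ is expanded out,
and applied to the vacuum state $|0\rangle$, great simplifications occur. Consider the
operator

$$\hat{s}_k = \hat{c}_k^\dagger\hat{c}_{-k}^\dagger - \hat{c}_{-k}\hat{c}_k. \tag{17.137}$$

We have

$$\hat{s}_k^2 = -\hat{c}_k^\dagger\hat{c}_{-k}^\dagger\hat{c}_{-k}\hat{c}_k - \hat{c}_{-k}\hat{c}_k\hat{c}_k^\dagger\hat{c}_{-k}^\dagger \tag{17.138}$$

so that $\hat{s}_k^2|0\rangle = -|0\rangle$. It follows that

$$\begin{aligned}
\exp(\theta_k\hat{s}_k)|0\rangle &= \left(1 + \theta_k\hat{s}_k - \frac{\theta_k^2}{2} - \frac{\theta_k^3}{3!}\hat{s}_k + \cdots\right)|0\rangle \\
&= (\cos\theta_k + \sin\theta_k\,\hat{s}_k)|0\rangle \\
&= (\cos\theta_k + \sin\theta_k\,\hat{c}_k^\dagger\hat{c}_{-k}^\dagger)|0\rangle
\end{aligned} \tag{17.139}$$

and hence

$$|{\rm ground}\rangle_{\rm BCS} = \prod_k(\cos\theta_k + \sin\theta_k\,\hat{c}_k^\dagger\hat{c}_{-k}^\dagger)|0\rangle. \tag{17.140}$$

As for the superfluid, (17.140) represents a coherent superposition of correlated
pairs, with no restraint on the particle number.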
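For a single pair of modes $(k, -k)$ the construction of the ground state can be verified with small matrices. The sketch below is an illustration under stated assumptions: it uses a Jordan-Wigner representation of the two modes on a four-dimensional space (a choice made here, not taken from the text), takes the generator of the unitary transformation to be the anti-Hermitian combination $\hat c_k^\dagger\hat c_{-k}^\dagger - \hat c_{-k}\hat c_k$, sums the exponential series acting on the empty state, and compares the result with the closed form. It then checks that the quasiparticle operator annihilates the state and evaluates the pair amplitude. Helper names and the value of the angle are hypothetical.

```python
import math

def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    nb = len(b)
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(nb)]
            for i in range(len(a)) for k in range(nb)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matvec(a, x):
    return [sum(a[i][j] * x[j] for j in range(len(x))) for i in range(len(a))]

I2 = [[1.0, 0.0], [0.0, 1.0]]
SZ = [[1.0, 0.0], [0.0, -1.0]]        # (-1)^n for a single mode
LOWER = [[0.0, 1.0], [0.0, 0.0]]      # maps the occupied state to the empty one

# Basis ordering: index = 2*n_k + n_{-k}; the empty state is index 0.
c_k = kron(LOWER, I2)
c_mk = kron(SZ, LOWER)
c_k_dag, c_mk_dag = transpose(c_k), transpose(c_mk)   # real matrices

pair_create = matmul(c_k_dag, c_mk_dag)    # c_k^dagger c_{-k}^dagger
pair_destroy = matmul(c_mk, c_k)           # c_{-k} c_k
s = [[pair_create[i][j] - pair_destroy[i][j] for j in range(4)] for i in range(4)]

def exp_on_vacuum(theta, terms=40):
    """Sum the series for exp(theta * s) acting on the empty state."""
    vacuum = [1.0, 0.0, 0.0, 0.0]
    result, term = vacuum[:], vacuum[:]
    for n in range(1, terms):
        term = [theta / n * x for x in matvec(s, term)]
        result = [r + t for r, t in zip(result, term)]
    return result

theta = 0.6                                 # illustrative angle
ground = exp_on_vacuum(theta)
closed_form = [math.cos(theta), 0.0, 0.0, math.sin(theta)]

# beta_k = cos(theta) c_k - sin(theta) c_{-k}^dagger should annihilate the state.
beta_k = [[math.cos(theta) * c_k[i][j] - math.sin(theta) * c_mk_dag[i][j]
           for j in range(4)] for i in range(4)]
annihilated = matvec(beta_k, ground)

# Pair amplitude <ground| c_{-k} c_k |ground>.
pair_amplitude = sum(g * x for g, x in zip(ground, matvec(pair_destroy, ground)))
print(ground, annihilated, pair_amplitude)
```

The pair amplitude comes out as $\cos\theta_k\sin\theta_k$, which is the quantity summed over $k$ in the expression for the gap parameter; it is non-zero only because the state mixes components with zero and two particles.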

We should emphasize that the above is only the barest outline of a simple 
version of BCS theory, with no electromagnetic interactions, from which many 
subtleties have been omitted. Consider, for example, the binding energy $E_{\rm b}$
of a pair, to calculate which one needs to evaluate the constant $\gamma$ in (17.114).
To a good approximation one finds (see for example Enz 1992) $E_{\rm b} \approx 3\Delta^2/E_{\rm F}$.
One can also calculate the approximate spatial extension of a pair, which is
denoted by the coherence length $\xi$ and is of order $v_{\rm F}/\pi\Delta$ where $k_{\rm F} = mv_{\rm F}$
is the Fermi momentum. If we compare $E_{\rm b}$ to the Coulomb repulsion at a
distance $\xi$ we find

$$E_{\rm b}/(\alpha/\xi) \sim a_0/\xi \tag{17.141}$$

where $a_0$ is the Bohr radius. Numerical values show that the right-hand side
of (17.141), in conventional superconductors, is of order $10^{-3}$. Hence the pairs
are not really bound, only correlated, and as many as $10^6$ pairs may have their
centres of mass within one coherence length of each other. Nevertheless, the
simple theory presented here contains the essential features which underlie all 
attempts to understand the dynamical occurrence of spontaneous symmetry 
breaking in fermionic systems. 
We now proceed to an important application in particle physics. 


Problems 
17.1 Verify (17.29). 
17.2 Verify (17.35). 
17.3 Derive (17.43) and (17.44). 
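17.4 Let

$$\hat{U}_\lambda = \exp[\tfrac{1}{2}\lambda\theta(\hat{a}^2 - \hat{a}^{\dagger 2})]$$

where $[\hat{a}, \hat{a}^\dagger] = 1$ and $\lambda$, $\theta$ are real parameters.

(a) Show that $\hat{U}_\lambda$ is unitary.

(b) Let

$$\hat{I}_\lambda = \hat{U}_\lambda\,\hat{a}\,\hat{U}_\lambda^{-1} \quad\text{and}\quad \hat{J}_\lambda = \hat{U}_\lambda\,\hat{a}^\dagger\,\hat{U}_\lambda^{-1}.$$

Show that

$$\frac{d\hat{I}_\lambda}{d\lambda} = \theta\hat{J}_\lambda$$

and that

$$\frac{d^2\hat{I}_\lambda}{d\lambda^2} = \theta^2\hat{I}_\lambda.$$

(c) Hence show that

$$\hat{I}_\lambda = \cosh(\lambda\theta)\,\hat{a} + \sinh(\lambda\theta)\,\hat{a}^\dagger,$$

and thus finally (compare (17.38) and (17.48)) that

$$\hat{U}_1\,\hat{a}\,\hat{U}_1^{-1} = \cosh\theta\,\hat{a} + \sinh\theta\,\hat{a}^\dagger \equiv \hat{\alpha}$$

and

$$\hat{U}_1\,\hat{a}^\dagger\,\hat{U}_1^{-1} = \sinh\theta\,\hat{a} + \cosh\theta\,\hat{a}^\dagger \equiv \hat{\alpha}^\dagger,$$

where

$$\hat{U}_1 \equiv \hat{U}_{\lambda=1} = \exp[\tfrac{1}{2}\theta(\hat{a}^2 - \hat{a}^{\dagger 2})].$$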


17.5 Insert the ansatz (17.84) for ¢ into La of (17.69), with Y = Vsp of 
(17.77), and show that the result for the constant term, and the quadratic 
terms in h and 0, is as given in (17.85). 


17.6 Verify that when (17.107) is inserted in (17.97), the terms quadratic in 
the fields H and 0 reveal that 0 is a massless field, while the quanta of the H 
field have mass y2p. 


17.7 Verify that the 6’s of (17.111) satisfy the required anticommutation 
relations if (17.112) holds. 


17.8 Verify (17.115). 
17.9 Derive (17.118) and (17.119). 
Chiral Symmetry Breaking


In section 12.4.2 we arrived at a puzzle: there seemed good reason to think 
that a world consisting of u and d quarks and their antiparticles, interacting 
via the colour gauge fields of QCD, should exhibit signs of the non-Abelian 
chiral symmetry SU(2)$_{f5}$, which was exact in the massless limit $m_u, m_d \to 0$.
But, as we showed, one of the simplest consequences of such a symmetry 
should be the existence of nucleon parity doublets, which are not observed. 
We can now resolve this puzzle by making the hypothesis (section 18.1) first 
articulated by Nambu (1960) and Nambu and Jona-Lasinio (1961a), that this 
chiral symmetry is spontaneously broken as a dynamical effect — presumably, 
from today’s perspective, as a property of the QCD interactions, as discussed 
in section 18.1.1. If this is so, an immediate physical consequence should be 
the appearance of massless (Goldstone) bosons, one for every symmetry not 
respected by the vacuum. Indeed, returning to (12.168) which we repeat here 
for convenience, 


$$\hat T^{(\frac12)}_{+5}\,|d\rangle = |\tilde u\rangle, \tag{18.1}$$

we now interpret the state $|\tilde u\rangle$ (which is degenerate with $|d\rangle$) as $|d + \text{‘}\pi^+\text{’}\rangle$
where ‘$\pi^+$’ is a massless particle of positive charge, but a pseudoscalar ($0^-$)
rather than a scalar ($0^+$) since, as we saw, $|\tilde u\rangle$ has opposite parity to $|u\rangle$. In
the same way, ‘$\pi^-$’ and ‘$\pi^0$’ will be associated with $\hat T^{(\frac12)}_{-5}$ and $\hat T^{(\frac12)}_{3\,5}$. Of course,
no such massless pseudoscalar particles are observed: but it is natural to hope
that when the small up and down quark masses are included, the real pions
($\pi^+, \pi^-, \pi^0$) will emerge as ‘anomalously light’, rather than strictly massless.
This is indeed how they do appear, particularly with respect to the octet of
vector mesons, which differ only in $q\bar q$ spin alignment from the $0^-$ octet. As Nambu
and Jona-Lasinio (1961a) said, ‘it is perhaps not a coincidence that there
exists such an entity [i.e. the Goldstone state(s)] in the form of the pion’.

If this was the only observable consequence of spontaneously breaking chi- 
ral symmetry, it would perhaps hardly be sufficient grounds for accepting 
the hypothesis. But there are two circumstances which greatly increase the 
phenomenological implications of the idea. First, the vector and axial vector
symmetry currents $\hat{\boldsymbol T}^{(\frac12)\mu}$ and $\hat{\boldsymbol T}^{(\frac12)\mu}_5$ of the u-d strong interaction SU(2)
symmetries (see (12.109) and (12.165)) happen to be the very same currents
which enter into strangeness-conserving semileptonic weak interactions (such
as $n \to p\,e^-\bar\nu_e$ and $\pi^- \to \mu^-\bar\nu_\mu$), as we shall see in chapter 20. Thus some
remarkable connections between weak- and strong-interaction parameters can be
established, such as the Goldberger-Treiman (1958) relation (see section 18.2)
and the Adler—Weisberger (Adler 1965, Weisberger 1965) relation. Second, it 
turns out that the dynamics of the Goldstone modes, and their interactions 
with other hadrons such as nucleons, are strongly constrained by the under- 
lying chiral symmetry of QCD; indeed, surprisingly detailed effective theories 
(see section 18.3) have been developed, which provide a very successful de- 
scription of the low energy dynamics of the Goldstone degrees of freedom. 
Finally we shall introduce the subject of chiral anomalies in section 18.4. 

It would take us too far from our main focus on gauge theories to pursue 
these interesting avenues in any detail. But we hope to convince the reader, in 
this chapter, that chiral symmetry breaking is an integral part of the Standard 
Model, being a fundamental property of QCD. 




18.1 The Nambu analogy 


We recall from section 12.4.2 that for ‘almost massless’ fermions it is natural 
to use the representation (3.40) for the Dirac matrices, in terms of which the 
Dirac equation reads 


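$$E\phi = \boldsymbol\sigma\cdot\boldsymbol p\,\phi + m\chi \tag{18.2}$$
$$E\chi = -\boldsymbol\sigma\cdot\boldsymbol p\,\chi + m\phi. \tag{18.3}$$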


Nambu (1960) and Nambu and Jona-Lasinio (1961a) pointed out a remarkable 
analogy between (18.2) and (18.3) and equations (17.122) and (17.123) which 
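describe the elementary excitations in a superconductor (in the case $\Delta$ is real),
and which we repeat here for convenience:

$$\omega_l\cos\theta_l = \epsilon_l\cos\theta_l + \Delta\sin\theta_l \tag{18.4}$$
$$\omega_l\sin\theta_l = -\epsilon_l\sin\theta_l + \Delta\cos\theta_l. \tag{18.5}$$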


In (18.4) and (18.5), $\cos\theta_l$ and $\sin\theta_l$ are respectively the components of the
electron destruction operator $\hat c_l$ and the electron creation operator $\hat c^\dagger_{-l}$ in the
quasiparticle operator $\hat\beta_l$ (see (17.111)):

$$\hat\beta_l = \cos\theta_l\,\hat c_l - \sin\theta_l\,\hat c^\dagger_{-l}. \tag{18.6}$$

The superposition in $\hat\beta_l$ combines operators which transform differently under
the U(1) (number) symmetry. The result of this spontaneous breaking of the
U(1) symmetry is the creation of the gap $\Delta$ (or $2\Delta$ for a number-conserving
excitation), and the appearance of a massless mode. If $\Delta$ vanishes, (17.126)
implies that $\theta_l = 0$, and we revert to the symmetry-respecting operators
$\hat c_l$, $\hat c^\dagger_{-l}$. Consider now (18.2) and (18.3). Here $\phi$ and $\chi$ are the components of
definite chirality in the Dirac spinor $\omega$ (compare (12.149)), which is itself not
a chirality eigenstate when $m \neq 0$. When $m$ vanishes, the Dirac equation for
$\omega$ decouples into two separate ones for the chirality eigenstates
$\phi_{\rm R} \equiv \begin{pmatrix}\phi\\ 0\end{pmatrix}$ and $\phi_{\rm L} \equiv \begin{pmatrix}0\\ \chi\end{pmatrix}$. Nambu therefore made the following analogy:

    Superconducting gap parameter $\Delta$   $\leftrightarrow$   Dirac mass $m$
    quasiparticle excitation                 $\leftrightarrow$   massive Dirac particle
    U(1) number symmetry                     $\leftrightarrow$   U(1)$_5$ chirality symmetry
    Goldstone mode                           $\leftrightarrow$   massless boson.

FIGURE 18.1
The type of fermion-antifermion in the ‘Nambu chiral condensate’.

In short, the mass of a Dirac particle arises from the (presumed) spontaneous
breaking of a chiral (or $\gamma_5$) symmetry, and this will be accompanied by a
massless boson.

Before proceeding we should note that there are features of the analogy, 
on both sides, which need qualification. First, the particle symmetry we want 
to interpret this way is SU(2)$_{f5}$ not U(1)$_5$, so the appropriate generalization
(Nambu and Jona-Lasinio 1961b) has to be understood. Second, we must 
again note that the BCS electrons are charged, so that in the real supercon- 
ducting case we are dealing with a spontaneously broken local U(1) symmetry, 
not a global one. By contrast, the SU(2)$_{f5}$ chiral symmetry is not gauged.

As usual, the quantum field theory vacuum is analogous to the many- 
body ground state. According to Nambu’s analogy, therefore, the vacuum 
for a massive Dirac particle is to be pictured as a condensate of correlated
pairs of massless fermions. Since the vacuum carries neither linear nor angular
momentum, the members of a pair must have equal and opposite momenta and
spins: they therefore have the same helicity. However, since the vacuum does
not violate fermion number conservation, one has to be a fermion and the other
an antifermion. This means (recalling the discussion after (12.147)) that they have
opposite chirality. Thus a typical pair in the Nambu vacuum is as shown in
figure 18.1. We may easily write down an expression for the Nambu vacuum,
analogous to (17.140) for the BCS ground state. Consider solutions $\phi_+$ and
$\chi_+$ of positive helicity in (18.2) and (18.3); then


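$$E\phi_+ = |\boldsymbol p|\,\phi_+ + m\chi_+ \tag{18.7}$$
$$E\chi_+ = -|\boldsymbol p|\,\chi_+ + m\phi_+. \tag{18.8}$$

Comparing (18.7) and (18.8) with (18.4) and (18.5), we can read off the mixing
coefficients $\cos\theta_p$ and $\sin\theta_p$ as (cf (17.127))

$$\cos\theta_p = \left[\frac12\left(1 + \frac{|\boldsymbol p|}{E}\right)\right]^{1/2} \tag{18.9}$$

$$\sin\theta_p = \left[\frac12\left(1 - \frac{|\boldsymbol p|}{E}\right)\right]^{1/2} \tag{18.10}$$

where $E = (m^2 + \boldsymbol p^2)^{1/2}$. The Nambu vacuum is then given by$^1$

$$|0\rangle_{\rm N} = \prod_{\boldsymbol p, s}\left(\cos\theta_p - \sin\theta_p\,\hat c^\dagger_s(\boldsymbol p)\,\hat d^\dagger_s(-\boldsymbol p)\right)|0\rangle_{m=0}, \tag{18.11}$$

where $\hat c^\dagger_s$’s and $\hat d^\dagger_s$’s are the operators in massless Dirac fields. Depending on
the sign of the helicity $s$, each pair in (18.11) carries $\pm 2$ units of chirality. We
may check this by noting that in the mode expansion of the Dirac field $\hat\psi$,
$\hat c_s(\boldsymbol p)$ operators go with u-spinors for which the $\gamma_5$ eigenvalue equals the helicity,
while $\hat d^\dagger_s(-\boldsymbol p)$ operators accompany v-spinors for which the $\gamma_5$ eigenvalue
equals minus the helicity. Thus under a chiral transformation $\hat\psi' = e^{-i\beta\gamma_5}\hat\psi$,
$\hat c_s \to e^{-i\beta s}\hat c_s$ and $\hat d^\dagger_s \to e^{i\beta s}\hat d^\dagger_s$, for a given $s$. Hence $\hat c^\dagger_s\hat d^\dagger_s$ acquires a factor
$e^{2i\beta s}$. Thus the Nambu vacuum does not have a definite chirality, and operators
carrying non-zero chirality can have non-vanishing vacuum expectation
values. A mass term $\bar{\hat\psi}\hat\psi$ is of just this kind, since under $\hat\psi \to e^{-i\beta\gamma_5}\hat\psi$ we find
$\hat\psi^\dagger\gamma^0\hat\psi \to \hat\psi^\dagger e^{i\beta\gamma_5}\gamma^0 e^{-i\beta\gamma_5}\hat\psi = \bar{\hat\psi}\,e^{-2i\beta\gamma_5}\hat\psi$. Thus, in analogy with (17.120), a
Dirac mass is associated with a non-zero value for ${}_{\rm N}\langle 0|\bar{\hat\psi}\hat\psi|0\rangle_{\rm N}$.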
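The reading-off of the mixing coefficients can be checked numerically. The following is a minimal sketch, assuming only the forms of (18.7)-(18.10) with the identifications $\phi_+ \propto \cos\theta_p$ and $\chi_+ \propto \sin\theta_p$; the function names `nambu_mixing` and `residuals` are illustrative and do not come from the text.

```python
import math


def nambu_mixing(p, m):
    """Return (cos_theta, sin_theta) of (18.9)-(18.10) for |p| = p and mass m."""
    energy = math.sqrt(m * m + p * p)
    # max(0, ...) guards against a tiny negative value from rounding when m = 0.
    cos_t = math.sqrt(0.5 * (1.0 + p / energy))
    sin_t = math.sqrt(max(0.0, 0.5 * (1.0 - p / energy)))
    return cos_t, sin_t


def residuals(p, m):
    """Residuals of E*cos = p*cos + m*sin and E*sin = -p*sin + m*cos."""
    energy = math.sqrt(m * m + p * p)
    c, s = nambu_mixing(p, m)
    return (energy * c - (p * c + m * s), energy * s - (-p * s + m * c))


if __name__ == "__main__":
    # Illustrative (p, m) pairs in arbitrary units.
    for p, m in [(0.0, 1.0), (0.3, 0.94), (5.0, 0.005)]:
        print(p, m, nambu_mixing(p, m), residuals(p, m))
```

In the massless limit the sketch gives $\sin\theta_p = 0$, so that no pairs are mixed into the vacuum (18.11), in line with the role of $m$ as the analogue of the gap $\Delta$.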

In the original conception by Nambu and co-workers, the fermion under
discussion was taken to be the nucleon, with ‘m’ the (spontaneously generated)
nucleon mass. The fermion-fermion interaction, necessarily invariant
under chiral transformations, was taken to be of the four-fermion type. As
we have seen in volume 1, this is actually a non-renormalizable theory, but a
physical cut-off was employed, somewhat analogous to the Fermi energy $E_{\rm F}$.
Thus the nucleon mass could not be dynamically predicted, unlike the analogous
gap parameter $\Delta$ in BCS theory. Nevertheless, a gap equation similar
to (17.131) could be formulated, and it was possible to show that when it
had a non-trivial solution, a massless bound state automatically appeared in
the $f\bar f$ channel (Nambu and Jona-Lasinio 1961a). This work was generalized
to the SU(2)$_{f5}$ case by Nambu and Jona-Lasinio (1961b), who showed that
if the chiral symmetry was broken explicitly by the introduction of a small
nucleon mass (~ 5 MeV), then the Goldstone pions would have their observed
non-zero (but small) mass. In addition, the Goldberger-Treiman (1958) relation
was derived, and a number of other applications were suggested. Subsequently,
Nambu with other collaborators (Nambu and Lurie 1962, Nambu
and Schrauner 1962) showed how the amplitudes for the emission of a single
‘soft’ (nearly massless, low momentum) pion could be calculated, for various
processes. These developments culminated in the Adler-Weisberger relation
(Adler 1965, Weisberger 1965) which involves two soft pions.

$^1$A different phase convention is used for $\hat d^\dagger_s(-\boldsymbol p)$ as compared to that for $\hat c^\dagger_{-l}$ in (17.111).

This work was all done in the absence of an agreed theory of the strong 
interactions (the NJ-L theory was an illustrative working model of dynami- 
cally generated spontaneous symmetry breaking, but not a complete theory 
of strong interactions). QCD became widely accepted as that theory around 
1973. In this case, of course, the ‘fermions in question’ are quarks, and the 
interactions between them are gluon exchanges, which conserve chirality as 
noted in section 12.4.2. The bulk of the masses of the qqq bound states which 
form baryons is then interpreted as being spontaneously generated, while a 
small explicit quark mass term in the Lagrangian is responsible for the non- 
zero pion mass. Let us therefore now turn to two-flavour QCD. 


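18.1.1 Two flavour QCD and SU(2)$_{fL}$×SU(2)$_{fR}$
Let us begin with the massless case, for which the fermionic part of the Lagrangian is

$$\hat{\mathcal L}_q = \bar{\hat u}\,i\slashed{\hat D}\,\hat u + \bar{\hat d}\,i\slashed{\hat D}\,\hat d \tag{18.12}$$

where $\hat u$ and $\hat d$ now stand for the field operators,

$$\hat D^\mu = \partial^\mu + i g_s\,\boldsymbol\lambda/2\cdot\hat{\boldsymbol A}^\mu, \tag{18.13}$$

and the $\boldsymbol\lambda$ matrices act on the colour (r,b,g) degree of freedom of the u and d
quarks. This Lagrangian is invariant under

(i) U(1)$_f$ ‘quark number’ transformations
$$\hat q \to e^{-i\alpha}\hat q; \tag{18.14}$$
(ii) SU(2)$_f$ ‘flavour isospin’ transformations
$$\hat q \to \exp(-i\boldsymbol\alpha\cdot\boldsymbol\tau/2)\,\hat q; \tag{18.15}$$
(iii) U(1)$_{f5}$ ‘axial quark number’ transformations
$$\hat q \to e^{-i\beta\gamma_5}\hat q; \tag{18.16}$$
(iv) SU(2)$_{f5}$ ‘axial flavour isospin’ transformations
$$\hat q \to \exp(-i\boldsymbol\beta\cdot\boldsymbol\tau/2\,\gamma_5)\,\hat q, \tag{18.17}$$

where

$$\hat q = \begin{pmatrix}\hat u\\ \hat d\end{pmatrix}. \tag{18.18}$$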
Symmetry (i) is unbroken, and its associated ‘charge’ operator (the quark
number operator) commutes with all other symmetry operators, so it need
not concern us further. Symmetry (ii) is the standard isospin symmetry of
chapter 12, explicitly broken by the electromagnetic interactions (and by the
difference in the masses $m_u$ and $m_d$, when included). Symmetry (iii) does
not correspond to any known conservation law; on the other hand, there are
not any near-massless isoscalar $0^-$ mesons, either, such as must be present
if the symmetry is spontaneously broken. The $\eta$ meson is an isoscalar $0^-$
meson, but with a mass of 547 MeV it is considerably heavier than the pion.
In fact, it can be understood as one of the Goldstone bosons associated with
the spontaneous breaking of the larger group SU(3)$_{f5}$, which includes the s
quark (see section 18.3.3). In that case, the symmetry (iii) becomes extended
to

$$\hat u \to e^{-i\beta\gamma_5}\hat u, \quad \hat d \to e^{-i\beta\gamma_5}\hat d, \quad \hat s \to e^{-i\beta\gamma_5}\hat s, \tag{18.19}$$

but there is still a missing light isoscalar $0^-$ meson. It can be shown that
its mass must be less than or equal to $\sqrt{3}\,m_\pi$ (Weinberg 1975), but no such
particle exists. This is the famous ‘U(1) problem’: it was resolved by ’t 
Hooft (1976a, 1986), by showing that the inclusion of instanton configurations 
(Belavin et al. 1975) in path integrals leads to violations of symmetry (iii) — 
see, for example, Weinberg (1996) section 23.5. Finally, symmetry (iv) is the 
one with which we are presently concerned. 

The symmetry currents associated with (iv) are those already given in 
(12.165), but we give them again here in a slightly different notation which 
will be similar to the one used for weak interactions: 


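$$\hat j^\mu_{i,5} = \bar{\hat q}\,\gamma^\mu\gamma_5\frac{\tau_i}{2}\,\hat q \qquad i = 1,2,3. \tag{18.20}$$

Similarly the currents associated with (ii) are

$$\hat j^\mu_i = \bar{\hat q}\,\gamma^\mu\frac{\tau_i}{2}\,\hat q \qquad i = 1,2,3. \tag{18.21}$$

The corresponding ‘charges’ are (compare (12.166))

$$\hat Q_{i,5} \equiv \int \hat j^0_{i,5}\,d^3x = \int \hat q^\dagger\gamma_5\frac{\tau_i}{2}\,\hat q\,d^3x, \tag{18.22}$$

previously denoted by $\hat T^{(\frac12)}_{i\,5}$, and (compare (12.101)),

$$\hat Q_i = \int \hat q^\dagger\frac{\tau_i}{2}\,\hat q\,d^3x, \tag{18.23}$$

previously denoted by $\hat T^{(\frac12)}_i$. As with all symmetries, it is interesting to discover
the algebra of the generators, which are the six charges $\hat Q_i$, $\hat Q_{i,5}$ in this
case. Patient work with the anticommutation relations for the operators in
$\hat q(x)$ and $\hat q^\dagger(x)$ gives the results (problem 18.1)

$$[\hat Q_i, \hat Q_j] = i\epsilon_{ijk}\hat Q_k \tag{18.24}$$
$$[\hat Q_i, \hat Q_{j,5}] = i\epsilon_{ijk}\hat Q_{k,5} \tag{18.25}$$
$$[\hat Q_{i,5}, \hat Q_{j,5}] = i\epsilon_{ijk}\hat Q_k. \tag{18.26}$$

Relation (18.24) has been seen before in (12.101), and simply says that the
$\hat Q_i$’s obey an SU(2) algebra. A simple trick reduces the rather complicated
algebra of (18.24)-(18.26) to something much simpler. Defining

$$\hat Q_{i,{\rm R}} = \frac12(\hat Q_i + \hat Q_{i,5}), \qquad \hat Q_{i,{\rm L}} = \frac12(\hat Q_i - \hat Q_{i,5}) \tag{18.27}$$

we find (problem 18.2)

$$[\hat Q_{i,{\rm R}}, \hat Q_{j,{\rm R}}] = i\epsilon_{ijk}\hat Q_{k,{\rm R}} \tag{18.28}$$
$$[\hat Q_{i,{\rm L}}, \hat Q_{j,{\rm L}}] = i\epsilon_{ijk}\hat Q_{k,{\rm L}} \tag{18.29}$$
$$[\hat Q_{i,{\rm R}}, \hat Q_{j,{\rm L}}] = 0. \tag{18.30}$$

The operators $\hat Q_{i,{\rm R}}$, $\hat Q_{i,{\rm L}}$ therefore behave like two commuting (independent)
angular momentum operators, each obeying the algebra of SU(2). For this
reason, the symmetry group of the combined symmetries (ii) and (iv) is called
SU(2)$_{fL}$×SU(2)$_{fR}$.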
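The algebra (18.24)-(18.30) can be checked numerically at the level of matrices. For bilinear charges of the form $\int \hat q^\dagger M \hat q\, d^3x$ built from fields with canonical anticommutators, the commutator of two charges is the charge of the matrix commutator, so it is enough to test $M = \tau_i/2 \otimes 1$ and $M = \tau_i/2 \otimes \gamma_5$. The sketch below does this in pure Python; the choice $\gamma_5 = {\rm diag}(1,1,-1,-1)$ and all helper names are assumptions made for illustration.

```python
def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def kron(a, b):
    """Kronecker product a (x) b."""
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]


def combine(ca, a, cb, b):
    """Linear combination ca*a + cb*b."""
    return [[ca * x + cb * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def commutator(a, b):
    return combine(1, matmul(a, b), -1, matmul(b, a))


def is_close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def eps(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) / 2


def structure_sum(gens, i, j):
    """Return i * sum_k eps_ijk * gens[k]."""
    n = len(gens[0])
    total = [[0j] * n for _ in range(n)]
    for k in range(3):
        total = combine(1, total, 1j * eps(i, j, k), gens[k])
    return total


TAU = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
ID4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
# ASSUMPTION: gamma_5 in a chiral representation; only gamma_5^2 = 1 is used.
GAMMA5 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

HALF_TAU = [[[0.5 * x for x in row] for row in t] for t in TAU]
Q = [kron(t, ID4) for t in HALF_TAU]        # matrices entering (18.23)
Q5 = [kron(t, GAMMA5) for t in HALF_TAU]    # matrices entering (18.22)
QR = [combine(0.5, q, 0.5, q5) for q, q5 in zip(Q, Q5)]    # (18.27)
QL = [combine(0.5, q, -0.5, q5) for q, q5 in zip(Q, Q5)]   # (18.27)


def check_chiral_algebra():
    """Return True if the matrices satisfy (18.24)-(18.26) and (18.28)-(18.30)."""
    zero = [[0j] * 8 for _ in range(8)]
    ok = True
    for i in range(3):
        for j in range(3):
            ok &= is_close(commutator(Q[i], Q[j]), structure_sum(Q, i, j))
            ok &= is_close(commutator(Q[i], Q5[j]), structure_sum(Q5, i, j))
            ok &= is_close(commutator(Q5[i], Q5[j]), structure_sum(Q, i, j))
            ok &= is_close(commutator(QR[i], QR[j]), structure_sum(QR, i, j))
            ok &= is_close(commutator(QL[i], QL[j]), structure_sum(QL, i, j))
            ok &= is_close(commutator(QR[i], QL[j]), zero)
    return ok


if __name__ == "__main__":
    print(check_chiral_algebra())
```

This is a consistency check on the matrix structure only; it does not replace the operator derivation asked for in problems 18.1 and 18.2.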

The decoupling effected by (18.27) has a simple interpretation. Referring 
to (18.22) and (18.23), we see that 


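$$\hat Q_{i,{\rm R}} = \int \hat q^\dagger\left(\frac{1+\gamma_5}{2}\right)\frac{\tau_i}{2}\,\hat q\,d^3x \tag{18.31}$$

and similarly for $\hat Q_{i,{\rm L}}$. But $((1\pm\gamma_5)/2)$ are just the projection operators $P_{\rm R,L}$
introduced in section 12.3.2, which project out the chiral parts of any fermion
field. Furthermore, it is easy to see that $P_{\rm R}^2 = P_{\rm R}$ and $P_{\rm L}^2 = P_{\rm L}$, so that $\hat Q_{i,{\rm R}}$
and $\hat Q_{i,{\rm L}}$ can also be written as

$$\hat Q_{i,{\rm R}} = \int \hat q_{\rm R}^\dagger\frac{\tau_i}{2}\,\hat q_{\rm R}\,d^3x, \qquad \hat Q_{i,{\rm L}} = \int \hat q_{\rm L}^\dagger\frac{\tau_i}{2}\,\hat q_{\rm L}\,d^3x, \tag{18.32}$$

where $\hat q_{\rm R} = ((1+\gamma_5)/2)\hat q$, $\hat q_{\rm L} = ((1-\gamma_5)/2)\hat q$. In a similar way, the currents
(18.20) and (18.21) can be written as

$$\hat j^\mu_i = \hat j^\mu_{i,{\rm R}} + \hat j^\mu_{i,{\rm L}}, \qquad \hat j^\mu_{i,5} = \hat j^\mu_{i,{\rm R}} - \hat j^\mu_{i,{\rm L}}, \tag{18.33}$$

where

$$\hat j^\mu_{i,{\rm R}} = \bar{\hat q}_{\rm R}\gamma^\mu\frac{\tau_i}{2}\hat q_{\rm R}, \qquad \hat j^\mu_{i,{\rm L}} = \bar{\hat q}_{\rm L}\gamma^\mu\frac{\tau_i}{2}\hat q_{\rm L}. \tag{18.34}$$

Thus the SU(2)$_{\rm L}$ and SU(2)$_{\rm R}$ refer to the two chiral components of the fermion
fields, which is why it is called chiral symmetry.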
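Under infinitesimal SU(2) isospin and axial isospin transformations, $\hat q$
transforms by

$$\hat q \to \hat q' = (1 - i\boldsymbol\epsilon\cdot\boldsymbol\tau/2 - i\boldsymbol\eta\cdot\boldsymbol\tau/2\,\gamma_5)\,\hat q. \tag{18.35}$$

This can be rewritten in terms of $\hat q_{\rm R}$ and $\hat q_{\rm L}$, using

$$\hat q = \hat q_{\rm R} + \hat q_{\rm L}, \qquad \gamma_5\hat q_{\rm R} = \hat q_{\rm R}, \qquad \gamma_5\hat q_{\rm L} = -\hat q_{\rm L}. \tag{18.36}$$

We find that

$$\hat q_{\rm R}' = (1 - i(\boldsymbol\epsilon + \boldsymbol\eta)\cdot\boldsymbol\tau/2)\,\hat q_{\rm R} \tag{18.37}$$

and similarly

$$\hat q_{\rm L}' = (1 - i(\boldsymbol\epsilon - \boldsymbol\eta)\cdot\boldsymbol\tau/2)\,\hat q_{\rm L}. \tag{18.38}$$

Hence $\hat q_{\rm R}$ and $\hat q_{\rm L}$ transform quite independently$^2$, which is why $[\hat Q_{i,{\rm R}}, \hat Q_{j,{\rm L}}] = 0$.

This formalism allows us to see immediately why (18.12) is chirally invariant:
problem 18.3 verifies that $\hat{\mathcal L}_q$ can be written as

$$\hat{\mathcal L}_q = \bar{\hat q}_{\rm R}\,i\slashed{\hat D}\,\hat q_{\rm R} + \bar{\hat q}_{\rm L}\,i\slashed{\hat D}\,\hat q_{\rm L}, \tag{18.39}$$

which is plainly invariant under (18.37) and (18.38), since $\hat D$ is flavour-blind.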

There is as yet no formal proof that this SU(2)$_{fL}$×SU(2)$_{fR}$ chiral symmetry
is spontaneously broken in QCD, though it can be argued that the
larger symmetry SU(3)$_{fL}$×SU(3)$_{fR}$, appropriate to three massless flavours,
must be spontaneously broken (see Weinberg 1996, section 22.5). This is, of
course, an issue that cannot be settled within perturbation theory (compare
the comments after (17.132)). Numerical solutions of QCD on a lattice (see
chapter 16) do provide strong evidence that baryons acquire large dynamical
(SU(2)$_{f5}$-breaking) mass.

Even granted that chiral symmetry is spontaneously broken in massless 
two-flavour QCD, how do we know that it breaks in such a way as to leave 
the isospin (‘R + L’) symmetry unbroken? A plausible answer can be given 
if we restore the quark mass terms via 


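$$\hat{\mathcal L}_m = -m_u\bar{\hat u}\hat u - m_d\bar{\hat d}\hat d = -\frac12(m_u + m_d)\,\bar{\hat q}\hat q - \frac12(m_u - m_d)\,\bar{\hat q}\tau_3\hat q. \tag{18.40}$$

Now

$$\bar{\hat q}\hat q = \bar{\hat q}_{\rm L}\hat q_{\rm R} + \bar{\hat q}_{\rm R}\hat q_{\rm L} \tag{18.41}$$

and

$$\bar{\hat q}\tau_3\hat q = \bar{\hat q}_{\rm L}\tau_3\hat q_{\rm R} + \bar{\hat q}_{\rm R}\tau_3\hat q_{\rm L}. \tag{18.42}$$

Including these extra terms is somewhat analogous to switching on an external
field in the ferromagnetic problem, which determines a preferred direction for
the symmetry breaking. It is clear that neither of (18.41) and (18.42) preserves
SU(2)$_{\rm L}$×SU(2)$_{\rm R}$ since they treat the L and R parts differently. Indeed from
(18.37) and (18.38) we find

$$\bar{\hat q}_{\rm L}\hat q_{\rm R} \to \bar{\hat q}_{\rm L}'\hat q_{\rm R}' = \bar{\hat q}_{\rm L}(1 + i(\boldsymbol\epsilon - \boldsymbol\eta)\cdot\boldsymbol\tau/2)(1 - i(\boldsymbol\epsilon + \boldsymbol\eta)\cdot\boldsymbol\tau/2)\,\hat q_{\rm R} \tag{18.43}$$
$$= \bar{\hat q}_{\rm L}\hat q_{\rm R} - i\boldsymbol\eta\cdot\bar{\hat q}_{\rm L}\boldsymbol\tau\hat q_{\rm R} \tag{18.44}$$

and

$$\bar{\hat q}_{\rm R}\hat q_{\rm L} \to \bar{\hat q}_{\rm R}\hat q_{\rm L} + i\boldsymbol\eta\cdot\bar{\hat q}_{\rm R}\boldsymbol\tau\hat q_{\rm L}. \tag{18.45}$$

Equations (18.44) and (18.45) confirm that the term $\bar{\hat q}\hat q$ in (18.40) is invariant
under the isospin part of SU(2)$_{\rm L}$×SU(2)$_{\rm R}$ (since $\boldsymbol\epsilon$ is not involved), but not
invariant under the axial isospin transformations parametrized by $\boldsymbol\eta$. The
$\bar{\hat q}\tau_3\hat q$ term explicitly breaks the third component of isospin (resembling an
electromagnetic effect), but its magnitude may be expected to be smaller
than that of the $\bar{\hat q}\hat q$ term, being proportional to the difference of the masses,
rather than their sum. This suggests that the vacuum will ‘align’ in such a
way as to preserve isospin, but break axial isospin.

$^2$We may set $\boldsymbol\gamma = \boldsymbol\epsilon + \boldsymbol\eta$, and $\boldsymbol\delta = \boldsymbol\epsilon - \boldsymbol\eta$.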




18.2 Pion decay and the Goldberger—Treiman relation 


We now discuss some of the rather surprising phenomenological implications of 
spontaneously broken chiral symmetry — specifically, the spontaneous break- 
ing of the axial isospin symmetry. We start by ignoring any ‘explicit’ quark 
masses, so that the axial isospin current is conserved, $\partial_\mu\hat j^\mu_{i,5} = 0$. From sections
17.4 and 17.5 (suitably generalized) we know that this current has non-zero
matrix elements between the vacuum and a ‘Goldstone’ state, which in our
case is the pion. We therefore set (cf (17.94))

$$\langle 0|\hat j^\mu_{i,5}(x)|\pi_j, p\rangle = i p^\mu f_\pi\,e^{-ip\cdot x}\,\delta_{ij}, \tag{18.46}$$

where $f_\pi$ is a constant with dimensions of mass, and which we expect to be
related to a symmetry breaking vev. This is just what we shall find in section
18.3.1. Note that (18.46) is consistent with $\partial_\mu\hat j^\mu_{i,5} = 0$ if $p^2 = 0$, i.e. if the
pion is massless.

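We treat $f_\pi$ as a phenomenological parameter. Its value can be determined
from the rate for the decay $\pi^+ \to \mu^+\nu_\mu$ by the following reasoning. In chapter
20 we shall learn that the effective weak Hamiltonian density for this low
energy strangeness non-changing semileptonic transition is

$$\hat{\mathcal H}_W(x) = \frac{G_{\rm F}}{\sqrt2}V_{ud}\,\bar{\hat\psi}_d(x)\gamma^\mu(1-\gamma_5)\hat\psi_u(x)
\times\left[\bar{\hat\psi}_{\nu_e}(x)\gamma_\mu(1-\gamma_5)\hat\psi_e(x) + \bar{\hat\psi}_{\nu_\mu}(x)\gamma_\mu(1-\gamma_5)\hat\psi_\mu(x)\right] \tag{18.47}$$

where $G_{\rm F}$ is the Fermi constant and $V_{ud}$ is an element of the Cabibbo-Kobayashi-Maskawa
(CKM) matrix (see section 20.7.3). Thus the lowest-order contribution
to the S-matrix is

$$-i\langle\mu^+,p_1;\nu_\mu,p_2|\int d^4x\,\hat{\mathcal H}_W(x)|\pi^+,p\rangle
= -i\frac{G_{\rm F}}{\sqrt2}V_{ud}\int d^4x\,\langle\mu^+,p_1;\nu_\mu,p_2|\bar{\hat\psi}_{\nu_\mu}(x)\gamma_\mu(1-\gamma_5)\hat\psi_\mu(x)|0\rangle
\times\langle 0|\bar{\hat\psi}_d(x)\gamma^\mu(1-\gamma_5)\hat\psi_u(x)|\pi^+,p\rangle. \tag{18.48}$$

The leptonic matrix element gives $\bar u_\nu(p_2)\gamma_\mu(1-\gamma_5)v_\mu(p_1)\,e^{i(p_1+p_2)\cdot x}$. For the
pionic one, we note that

$$\bar{\hat\psi}_d(x)\gamma^\mu(1-\gamma_5)\hat\psi_u(x) = \hat j^\mu_1(x) - i\hat j^\mu_2(x) - \hat j^\mu_{1,5}(x) + i\hat j^\mu_{2,5}(x) \tag{18.49}$$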


from (18.20) and (18.21). Further, the currents $\hat j^\mu_i$ can have no matrix elements
between the vacuum (which is a $0^+$ state) and the $\pi$ (which is $0^-$), by the
following argument. From Lorentz invariance such a matrix element has to
be a 4-vector. But since the initial and final parities are different, it would
have to be an axial 4-vector$^3$. However, the only 4-vector available is the
pion’s momentum $p^\mu$ which is an ordinary (not an axial) 4-vector. On the
other hand, precisely for this reason the axial currents $\hat j^\mu_{i,5}$ do have a non-zero
matrix element, as in (18.46). Noting that $|\pi^+\rangle = \frac{1}{\sqrt2}|\pi_1 + i\pi_2\rangle$, we find that

$$\langle 0|\bar{\hat\psi}_d(x)\gamma^\mu(1-\gamma_5)\hat\psi_u(x)|\pi^+,p\rangle = \langle 0|-\hat j^\mu_{1,5}(x) + i\hat j^\mu_{2,5}(x)|\tfrac{1}{\sqrt2}(\pi_1 + i\pi_2)\rangle \tag{18.50}$$
$$= -i\sqrt2\,p^\mu f_\pi\,e^{-ip\cdot x} \tag{18.51}$$

so that (18.48) becomes

$$i(2\pi)^4\delta^4(p_1 + p_2 - p)\left[i\,G_{\rm F}V_{ud}\,\bar u_\nu(p_2)\gamma_\mu(1-\gamma_5)v_\mu(p_1)\,p^\mu f_\pi\right]. \tag{18.52}$$

The quantity in brackets is, therefore, the invariant amplitude for the process,
$\mathcal M$. Using $p = p_1 + p_2$, we may replace $\slashed p$ in (18.52) by $m_\mu$, neglecting the
neutrino mass.

Before proceeding, we comment on the physics of (18.52). The $(1-\gamma_5)$
factor acting on a v spinor selects out the $\gamma_5 = -1$ eigenvalue which, if the
muon was massless, would correspond to positive helicity for the $\mu^+$ (compare
the discussion in section 12.4.2). Likewise, taking the $(1-\gamma_5)$ through the
$\gamma^0\gamma_\mu$ factor to act on $u_\nu^\dagger$, it selects the negative helicity neutrino state. Hence
the configuration is as shown in figure 18.2, so that the leptons carry off a
net spin angular momentum. But this is forbidden, since the pion spin is
zero. Hence the amplitude vanishes for massless muons and neutrinos. Now
the muon, at least, is not massless, and some ‘wrong’ helicity is present in
its wavefunction, in an amount proportional to $m_\mu$. This is why, as we have
just remarked after (18.52), the amplitude is proportional to $m_\mu$. The rate is
therefore proportional to $m_\mu^2$. This is a very important conclusion, because it
implies that the rate to muons is $\sim (m_\mu/m_e)^2 \sim (200)^2$ times greater than
the rate to electrons, a result which agrees with experiment, while grossly
contradicting the naive expectation that the rate with the larger energy release
should dominate. This, in fact, is one of the main indications for the ‘vector-axial
vector’, or ‘V-A’, structure of (18.47), as we shall see in more detail in
section 20.2.

$^3$See chapter 4 of volume 1.

FIGURE 18.2
Helicities of massless leptons in $\pi^+ \to \mu^+\nu_\mu$ due to the ‘V-A’ interaction.
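Problem 18.4 shows that the rate computed from (18.52) is

$$\Gamma_{\pi\to\mu\nu} = \frac{G_{\rm F}^2\,m_\mu^2\,f_\pi^2\,(m_\pi^2 - m_\mu^2)^2}{4\pi m_\pi^3}\,|V_{ud}|^2. \tag{18.53}$$

Including radiative corrections, the value

$$f_\pi \simeq 92\ {\rm MeV} \tag{18.54}$$

can be extracted.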
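As a rough numerical illustration of (18.53), the sketch below evaluates the tree-level rate and the corresponding lifetime, and the electron-to-muon ratio obtained by replacing $m_\mu$ with $m_e$. The numerical inputs (Fermi constant, masses, $|V_{ud}|$, $\hbar$) are standard approximate values supplied here as assumptions, not quantities quoted in this section, and no radiative corrections are included.

```python
import math

# ASSUMED inputs (approximate standard values, natural units in GeV).
G_F = 1.1664e-5      # Fermi constant in GeV^-2
M_PI = 0.13957       # charged pion mass in GeV
M_MU = 0.10566       # muon mass in GeV
M_E = 0.000511       # electron mass in GeV
F_PI = 0.092         # pion decay constant, as in (18.54), in GeV
V_UD = 0.974         # magnitude of the CKM element V_ud
HBAR = 6.582e-25     # hbar in GeV s


def gamma_pi_to_lepton(m_lepton, f_pi=F_PI):
    """Tree-level rate (18.53) in GeV, with m_mu replaced by m_lepton."""
    numerator = (G_F ** 2 * m_lepton ** 2 * f_pi ** 2
                 * (M_PI ** 2 - m_lepton ** 2) ** 2 * V_UD ** 2)
    return numerator / (4.0 * math.pi * M_PI ** 3)


def pion_lifetime_seconds():
    """Lifetime implied by the muonic mode alone (the electron mode is negligible)."""
    return HBAR / gamma_pi_to_lepton(M_MU)


def electron_to_muon_ratio():
    """Ratio of electronic to muonic rates; f_pi, G_F and V_ud cancel."""
    return gamma_pi_to_lepton(M_E) / gamma_pi_to_lepton(M_MU)


if __name__ == "__main__":
    print("lifetime (s):", pion_lifetime_seconds())
    print("Gamma(e nu) / Gamma(mu nu):", electron_to_muon_ratio())
```

With these inputs the lifetime comes out within a few per cent of the measured charged-pion lifetime of about $2.6\times10^{-8}$ s. The ratio is of order $10^{-4}$: the helicity factor $(m_e/m_\mu)^2$ is partly offset by the larger phase space of the electron mode.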

Consider now another matrix element of $\hat j^\mu_{i,5}$, this time between nucleon
states. Following an analysis similar to that in section 8.8 for the matrix
elements of the electromagnetic current operator between nucleon states, we
write

$$\langle N, p'|\hat j^\mu_{i,5}(0)|N, p\rangle
= \bar u(p')\left[\gamma^\mu\gamma_5F_1^5(q^2) + \frac{i\sigma^{\mu\nu}}{2M}q_\nu\gamma_5F_2^5(q^2) + q^\mu\gamma_5F_3^5(q^2)\right]\frac{\tau_i}{2}\,u(p), \tag{18.55}$$

where the $F_i^5$’s are certain form factors, $M$ is the nucleon mass, and $q = p - p'$.
The spinors in (18.55) are understood to be written in flavour and Dirac space.
Since (with massless quarks) $\hat j^\mu_{i,5}$ is conserved, that is $q_\mu\hat j^\mu_{i,5}(0) = 0$, we find

$$0 = \bar u(p')\left[\slashed q\gamma_5F_1^5(q^2) + q^2\gamma_5F_3^5(q^2)\right]\frac{\tau_i}{2}\,u(p)$$
$$= \bar u(p')\left[(\slashed p - \slashed p')\gamma_5F_1^5(q^2) + q^2\gamma_5F_3^5(q^2)\right]\frac{\tau_i}{2}\,u(p)$$
$$= \bar u(p')\left[-2M\gamma_5F_1^5(q^2) + q^2\gamma_5F_3^5(q^2)\right]\frac{\tau_i}{2}\,u(p), \tag{18.56}$$

using $\slashed p\gamma_5 = -\gamma_5\slashed p$ and the Dirac equations for $u(p)$, $\bar u(p')$. Hence the form
factors $F_1^5$ and $F_3^5$ must satisfy

$$2MF_1^5(q^2) = q^2F_3^5(q^2). \tag{18.57}$$

FIGURE 18.3
One pion intermediate state contribution to $F_3^5$.


Now the matrix element (18.55) enters into neutron $\beta$-decay (as does the
matrix element of $\hat j^\mu_i(0)$). Here, $q^2 \approx 0$ and (18.57) appears to predict, therefore,
that either $M = 0$ (which is certainly not so) or $F_1^5(0) = 0$. But $F_1^5(0)$
can be measured in $\beta$ decay, and is found to be approximately equal to 1.26; it
is conventionally called $g_A$. The only possible conclusion is that $F_3^5$ must contain
a part proportional to $1/q^2$. Such a contribution can only arise from the
propagator of a massless particle which, of course, is the pion. This elegant
physical argument, first given by Nambu (1960), sheds a revealing new light
on the phenomenon of spontaneous symmetry breaking: the existence of the
massless particle coupled to the symmetry current $\hat j^\mu_{i,5}$ ‘saves’ the conservation
of the current.

We calculate the pion contribution to $F_3^5$ as follows. The process is pictured
in figure 18.3. The pion-current matrix element is given by (18.46),
and the (massless) propagator is $i/q^2$. For the $\pi$-N vertex, the conventional
Lagrangian is

$$ig_{\pi NN}\,\hat\pi_i\,\bar{\hat N}\gamma_5\tau_i\hat N, \tag{18.58}$$

which is SU(2)$_f$-invariant and parity conserving since the pion field is a pseudoscalar,
and so is $\bar{\hat N}\gamma_5\hat N$. Putting these pieces together, the contribution of
figure 18.3 to the current matrix element is

$$2g_{\pi NN}\,\bar u(p')\gamma_5\frac{\tau_i}{2}u(p)\;\frac{i}{q^2}\;(-iq^\mu f_\pi), \tag{18.59}$$

and so

$$F_3^5(q^2) = \frac{1}{q^2}\,2g_{\pi NN}f_\pi \tag{18.60}$$

from this contribution. Combining (18.57) with (18.60) we deduce

$$g_A \equiv \lim_{q^2\to 0}F_1^5(q^2) = \frac{g_{\pi NN}f_\pi}{M}, \tag{18.61}$$

the Goldberger-Treiman (1958) relation. Taking $M = 939$ MeV, $g_A = 1.26$
and $f_\pi = 92$ MeV one obtains $g_{\pi NN} \approx 12.9$, which is only 5% below the
experimental value of this effective pion-nucleon coupling constant.
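The arithmetic behind the quoted coupling is a one-line evaluation of (18.61) solved for $g_{\pi NN}$. The sketch below uses only the three input values stated in the text; the function name is illustrative.

```python
def g_pi_nn(nucleon_mass_mev=939.0, g_a=1.26, f_pi_mev=92.0):
    """Pion-nucleon coupling from the Goldberger-Treiman relation (18.61)."""
    return nucleon_mass_mev * g_a / f_pi_mev


if __name__ == "__main__":
    print(round(g_pi_nn(), 2))
```

Because the relation is linear in each input, a 1% change in $f_\pi$ or $g_A$ shifts the predicted coupling by 1%.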

We can repeat the argument leading to the G-T relation but retaining
$m_\pi^2 \neq 0$. Equation (18.46) tells us that $\partial_\mu\hat j^\mu_{i,5}/(m_\pi^2f_\pi)$ behaves like a properly
normalized pion field, at least when operating on a near mass-shell pion state.
This means that the one-nucleon matrix element of $\partial_\mu\hat j^\mu_{i,5}$ is (cf (18.59))

$$2g_{\pi NN}\,\bar u(p')\gamma_5\frac{\tau_i}{2}u(p)\;\frac{i}{q^2 - m_\pi^2}\;m_\pi^2f_\pi, \tag{18.62}$$

while from (18.55) it is given by

$$i\,\bar u(p')\left[-2M\gamma_5F_1^5(q^2) + q^2\gamma_5F_3^5(q^2)\right]\frac{\tau_i}{2}\,u(p). \tag{18.63}$$

Hence

$$-2MF_1^5(q^2) + q^2F_3^5(q^2) = \frac{2g_{\pi NN}\,m_\pi^2f_\pi}{q^2 - m_\pi^2}. \tag{18.64}$$

Also, in place of (18.60) we now have

$$F_3^5(q^2) = \frac{1}{q^2 - m_\pi^2}\,2g_{\pi NN}f_\pi. \tag{18.65}$$

Equations (18.64) and (18.65) are consistent for $q^2 = m_\pi^2$ if

$$F_1^5(q^2 = m_\pi^2) = g_{\pi NN}f_\pi/M. \tag{18.66}$$

$F_1^5(q^2)$ varies only slowly from $q^2 = 0$ to $q^2 = m_\pi^2$, since it contains no rapidly
varying pion pole contribution, and so we recover the G-T relation again.

Amplitudes involving two Goldstone pions can be calculated by an exten- 
sion of these techniques. However, a much more efficient method is available, 
through the use of effective Lagrangians, which capture the low energy dy- 
namics of the Goldstone modes. 


18.3 Effective Lagrangians 

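18.3.1 The linear and non-linear σ-models

We begin by considering the linear σ-model, which has the same Lagrangian
as the one considered in section 17.6,

$$\hat{\mathcal L}_\Phi = (\partial_\mu\hat\phi^\dagger)(\partial^\mu\hat\phi) + \mu^2\hat\phi^\dagger\hat\phi - \frac{\lambda}{4}(\hat\phi^\dagger\hat\phi)^2, \tag{18.67}$$

but we shall interpret it differently here. The sign of the $\mu^2$ term has been
chosen to induce spontaneous symmetry breaking. In section 17.6, $\hat\phi$ was the
SU(2) doublet

$$\hat\phi = \begin{pmatrix}\frac{1}{\sqrt2}(\hat\phi_1 + i\hat\phi_2)\\ \frac{1}{\sqrt2}(\hat\phi_3 + i\hat\phi_4)\end{pmatrix} \tag{18.68}$$

in terms of which (18.67) becomes

$$\hat{\mathcal L}_\Phi = \frac12\partial_\mu\hat\phi_a\partial^\mu\hat\phi_a + \frac12\mu^2\hat\phi_a\hat\phi_a - \frac{\lambda}{16}(\hat\phi_a\hat\phi_a)^2, \tag{18.69}$$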

where the sum on a = 1 to 4 is understood. Evidently (18.69) is invariant
under transformations which preserve the ‘dot product’ $\hat\phi_a\hat\phi_a$; namely the
transformations of SO(4). This group is discussed in appendix M, section
M.4.3. We note there that the algebra of the generators of SO(4) is the same
as that of SU(2) × SU(2), which is the algebra of the chiral charges in (18.28)-(18.30).
This suggests that we should rewrite (18.69) in such a way as to reveal
its SU(2)$_{\rm L}$×SU(2)$_{\rm R}$ symmetry, rather than its O(4) symmetry. Three of the
four fields will then be identified with the Goldstone bosons associated with
the spontaneous breaking of the ‘R − L’ part; they will in turn be identified
with the (massless) pions.
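One way to bring out the chiral symmetry of (18.69) is to write

$$\hat\phi = \begin{pmatrix}(\hat\pi_2 + i\hat\pi_1)/\sqrt2\\ (\hat\sigma - i\hat\pi_3)/\sqrt2\end{pmatrix} = \frac{1}{\sqrt2}(\hat\sigma + i\boldsymbol\tau\cdot\hat{\boldsymbol\pi})\begin{pmatrix}0\\ 1\end{pmatrix} \equiv \frac{1}{\sqrt2}\hat\Sigma\begin{pmatrix}0\\ 1\end{pmatrix}, \tag{18.70}$$

where

$$\hat\Sigma = \hat\sigma + i\boldsymbol\tau\cdot\hat{\boldsymbol\pi}. \tag{18.71}$$

Then

$$\hat\phi_a\hat\phi_a = \frac12{\rm Tr}(\hat\Sigma^\dagger\hat\Sigma), \tag{18.72}$$

and (18.69) becomes

$$\hat{\mathcal L}_\Sigma = \frac14{\rm Tr}(\partial_\mu\hat\Sigma\,\partial^\mu\hat\Sigma^\dagger) + \frac{\mu^2}{4}{\rm Tr}(\hat\Sigma^\dagger\hat\Sigma) - \frac{\lambda}{64}\left[{\rm Tr}(\hat\Sigma^\dagger\hat\Sigma)\right]^2. \tag{18.73}$$

This Lagrangian is invariant under the SU(2)$_{\rm L}$×SU(2)$_{\rm R}$ transformation

$$\hat\Sigma \to U_{\rm L}\,\hat\Sigma\,U_{\rm R}^\dagger \tag{18.74}$$

where

$$U_{\rm L} = \exp(-i\boldsymbol\alpha_{\rm L}\cdot\boldsymbol\tau/2), \qquad U_{\rm R} = \exp(-i\boldsymbol\alpha_{\rm R}\cdot\boldsymbol\tau/2) \tag{18.75}$$

are two independent SU(2) transformations (remember that TrAB = TrBA).
For the case of infinitesimal transformations, we find (problem 18.5)

$$\hat\sigma \to \hat\sigma - \boldsymbol\eta\cdot\hat{\boldsymbol\pi} \tag{18.76}$$
$$\hat{\boldsymbol\pi} \to \hat{\boldsymbol\pi} + \boldsymbol\eta\,\hat\sigma + \boldsymbol\epsilon\times\hat{\boldsymbol\pi}, \tag{18.77}$$

where

$$\boldsymbol\eta = (\boldsymbol\epsilon_{\rm R} - \boldsymbol\epsilon_{\rm L})/2, \qquad \boldsymbol\epsilon = (\boldsymbol\epsilon_{\rm R} + \boldsymbol\epsilon_{\rm L})/2. \tag{18.78}$$

Evidently $\boldsymbol\epsilon_{\rm R} = \boldsymbol\eta + \boldsymbol\epsilon$ and $\boldsymbol\epsilon_{\rm L} = \boldsymbol\epsilon - \boldsymbol\eta$, which we may compare with the L and
R transformation of the quark fields in (18.37), (18.38).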
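The invariance of ${\rm Tr}(\hat\Sigma^\dagger\hat\Sigma)$ under (18.74), and the infinitesimal laws (18.76)-(18.77), can be checked numerically by treating $\sigma$ and $\boldsymbol\pi$ as ordinary real numbers. The sketch below is a classical check only, written in pure Python; it uses the closed form $\exp(-i\boldsymbol\alpha\cdot\boldsymbol\tau/2) = \cos(\alpha/2) - i\,\hat{\boldsymbol\alpha}\cdot\boldsymbol\tau\,\sin(\alpha/2)$, and all function names are illustrative assumptions.

```python
import math

TAU = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
I2 = [[1, 0], [0, 1]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


def trace(a):
    return a[0][0] + a[1][1]


def dot_tau(v):
    """Return the 2x2 matrix v . tau."""
    return [[sum(v[k] * TAU[k][i][j] for k in range(3)) for j in range(2)] for i in range(2)]


def su2(alpha):
    """exp(-i alpha.tau/2) as in (18.75), via the closed form."""
    a = math.sqrt(sum(x * x for x in alpha))
    if a == 0.0:
        return [[1 + 0j, 0j], [0j, 1 + 0j]]
    n_tau = dot_tau([x / a for x in alpha])
    c, s = math.cos(a / 2), math.sin(a / 2)
    return [[c * I2[i][j] - 1j * s * n_tau[i][j] for j in range(2)] for i in range(2)]


def sigma_field(sig, pi):
    """Sigma = sigma + i tau.pi, as in (18.71)."""
    pt = dot_tau(pi)
    return [[sig * I2[i][j] + 1j * pt[i][j] for j in range(2)] for i in range(2)]


def components(s_mat):
    """Recover (sigma, pi) from Sigma using Tr(Sigma) and Tr(tau_k Sigma)."""
    sig = (trace(s_mat) / 2).real
    pi = [(trace(matmul(TAU[k], s_mat)) / 2j).real for k in range(3)]
    return sig, pi


def transform(s_mat, alpha_l, alpha_r):
    """Sigma -> U_L Sigma U_R^dagger, as in (18.74)."""
    return matmul(matmul(su2(alpha_l), s_mat), dagger(su2(alpha_r)))


def invariant(s_mat):
    """(1/2) Tr(Sigma^dagger Sigma) = sigma^2 + pi^2."""
    return (trace(matmul(dagger(s_mat), s_mat)) / 2).real


def predict_infinitesimal(sig, pi, eps_l, eps_r):
    """First-order change according to (18.76)-(18.78)."""
    eta = [(r - l) / 2 for l, r in zip(eps_l, eps_r)]
    eps = [(r + l) / 2 for l, r in zip(eps_l, eps_r)]
    cross = [eps[1] * pi[2] - eps[2] * pi[1],
             eps[2] * pi[0] - eps[0] * pi[2],
             eps[0] * pi[1] - eps[1] * pi[0]]
    new_sig = sig - sum(e * p for e, p in zip(eta, pi))
    new_pi = [p + e * sig + c for p, e, c in zip(pi, eta, cross)]
    return new_sig, new_pi
```

For example, `invariant(transform(sigma_field(0.7, [0.1, -0.2, 0.3]), [0.3, -1.1, 0.5], [2.0, 0.4, -0.7]))` should agree with `invariant(sigma_field(0.7, [0.1, -0.2, 0.3]))` up to rounding, for any choice of the two parameter vectors.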
With the sign of $\mu^2$ as in (18.73), the classical potential has a minimum at

$$\hat\sigma^2 + \hat{\boldsymbol\pi}^2 = 4\mu^2/\lambda \equiv v^2, \tag{18.79}$$

which we interpret as the symmetry breaking condition

$$\langle 0|\hat\sigma^2 + \hat{\boldsymbol\pi}^2|0\rangle = v^2. \tag{18.80}$$

Let us choose the particular ground state

$$\langle 0|\hat\sigma|0\rangle = v, \qquad \langle 0|\hat{\boldsymbol\pi}|0\rangle = 0, \tag{18.81}$$

which is actually the same as (17.103). Referring back to (18.76) and (18.77)
we see that this vacuum is invariant under ‘L + R’ transformations with parameters
$\boldsymbol\epsilon$, but not under ‘L − R’ transformations with parameters $\boldsymbol\eta$. These
correspond respectively to the SU(2)$_f$ flavour isospin, and SU(2)$_{f5}$ axial flavour
isospin, transformations on the quark fields. So this vacuum spontaneously
breaks the axial isospin symmetry. Fluctuations away from this minimum are
described by fields $\hat{\boldsymbol\pi}$ and $\hat s = \hat\sigma - v$. Placing this shift into (18.73) we find
that $\hat{\mathcal L}_\Sigma$ becomes $\hat{\mathcal L}_s$ where

$$\hat{\mathcal L}_s = \frac12\partial_\mu\hat s\,\partial^\mu\hat s - \mu^2\hat s^2 + \frac12\partial_\mu\hat{\boldsymbol\pi}\cdot\partial^\mu\hat{\boldsymbol\pi} - \frac{\lambda}{4}v\hat s(\hat s^2 + \hat{\boldsymbol\pi}^2) - \frac{\lambda}{16}(\hat s^2 + \hat{\boldsymbol\pi}^2)^2, \tag{18.82}$$

discarding an irrelevant constant. As expected, the field $\hat s$ is massive (with
mass $\sqrt2\mu$), while the fields $\hat{\boldsymbol\pi}$ are massless, and may be identified with the
Goldstone modes associated with the spontaneous breaking of the axial isospin
symmetry.

The Lagrangian $\hat{\mathcal L}_s$ incorporates the correct symmetries, and can be used
to calculate $\pi$-$\pi$ scattering, for example (in the massless limit). But it is not
the most efficient Lagrangian to use, as we can see from the following considerations.
Consider the amplitude for $\pi^+$-$\pi^0$ scattering, in tree approximation
(Donoghue et al. 1992). The contributing terms in $\hat{\mathcal L}_s$ are

$$-\frac{\lambda}{4}v\hat s\,\hat{\boldsymbol\pi}^2 - \frac{\lambda}{16}(\hat{\boldsymbol\pi}^2)^2, \tag{18.83}$$

which we can rewrite in terms of the charged and neutral fields as

$$-\frac{\lambda}{4}v\hat s\,(2\hat\pi^+\hat\pi^- + \hat\pi^{0\,2}) - \frac{\lambda}{16}(2\hat\pi^+\hat\pi^- + \hat\pi^{0\,2})^2. \tag{18.84}$$

Then the terms responsible for $\pi^+$-$\pi^0$ scattering at tree level are

$$-\frac{\lambda}{4}\hat\pi^+\hat\pi^-\hat\pi^0\hat\pi^0 - \frac{\lambda}{2}v\hat s\left(\hat\pi^+\hat\pi^- + \frac12\hat\pi^0\hat\pi^0\right). \tag{18.85}$$

The first of these represents a four-pion contact interaction with amplitude

$$-i\lambda/2, \tag{18.86}$$

while the second contributes an $\hat s$-exchange graph in the t-channel with amplitude

$$(-i\lambda v/2)^2\,\frac{i}{q^2 - 2\mu^2} \tag{18.87}$$

where $q$ is the 4-momentum transfer $q = p_+' - p_+ = p_0 - p_0'$. The sum of these
is

$$-\frac{i\lambda}{2}\,\frac{q^2}{q^2 - 2\mu^2}, \tag{18.88}$$

which reduces to $iq^2/v^2$ for $q^2 \approx 0$. Thus, despite the apparent constant 4-boson
piece (18.86), the total amplitude in fact vanishes as $q^2 \to 0$, due to a
cancellation.
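The cancellation between (18.86) and (18.87) is easy to see numerically. The sketch below adds the two tree-level pieces for illustrative values of $\lambda$ and $\mu$ (chosen arbitrarily, not taken from the text) and compares the sum with the low-energy form $iq^2/v^2$, using $v^2 = 4\mu^2/\lambda$ from (18.79).

```python
import math


def vev_squared(lam, mu):
    """v^2 = 4 mu^2 / lambda, from (18.79)."""
    return 4.0 * mu ** 2 / lam


def contact_amplitude(lam):
    """Four-pion contact term (18.86)."""
    return -0.5j * lam


def exchange_amplitude(lam, mu, q2):
    """s-exchange term (18.87), with the s mass squared equal to 2 mu^2."""
    v = math.sqrt(vev_squared(lam, mu))
    vertex = -0.5j * lam * v
    return vertex ** 2 * 1j / (q2 - 2.0 * mu ** 2)


def total_amplitude(lam, mu, q2):
    """Sum of contact and exchange pieces, equal to (18.88)."""
    return contact_amplitude(lam) + exchange_amplitude(lam, mu, q2)


def low_energy_amplitude(lam, mu, q2):
    """Leading low-energy form i q^2 / v^2."""
    return 1j * q2 / vev_squared(lam, mu)


if __name__ == "__main__":
    lam, mu = 2.0, 1.0   # illustrative values only
    for q2 in (0.0, -1e-3, -1e-1):
        print(q2, total_amplitude(lam, mu, q2), low_energy_amplitude(lam, mu, q2))
```

The two pieces are separately of order $\lambda$, while their sum is suppressed by $q^2/2\mu^2$; the agreement with $iq^2/v^2$ degrades as $|q^2|$ approaches $2\mu^2$, where the $\hat s$ pole becomes important.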

This cancellation is not an accident. It is generally true that Goldstone 
fields enter only via their derivatives, which bring factors of momenta into the 
amplitudes. We drew attention to this following equation (17.85), and the 
same is true of the 6 fields in (17.107). This suggests that it is both possible, 
and more efficient, to recast £, into a form in which only the derivatives of 
the Goldstone fields enter. Equation (17.107) indicates how to do this: we 
define new pion fields (but call them the same) by 


S=(v+S)U, U = explir - #/v), (18.89) 
where $ is invariant under SU(2),x SU(2)r, and where U transforms by 
Û > ULÛU}. (18.90) 


Now EY = (v + $)?, and the Goldstone modes have been transformed away 
from the potential terms in £y, reappearing in the derivative terms instead. 
We write the transformed Lagrangian as Ls where 


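$$ \hat{\mathcal{L}}_S = \tfrac12\partial_\mu\hat S\,\partial^\mu\hat S - \mu^2\hat S^2 + \tfrac14 (v+\hat S)^2\,\mathrm{Tr}(\partial_\mu\hat U\partial^\mu\hat U^\dagger) - \tfrac{\lambda}{4}v\hat S^3 - \tfrac{\lambda}{16}\hat S^4, \tag{18.91} $$

where we have used

$$ \hat U\partial_\mu\hat U^\dagger + (\partial_\mu\hat U)\hat U^\dagger = 0, \tag{18.92} $$

which follows from the unitary condition $\hat U\hat U^\dagger = 1$.
When $\partial_\mu\hat U$ is expanded in powers of $\hat{\boldsymbol{\pi}}$, we recover a kinetic energy piece

$$ \tfrac12\partial_\mu\hat{\boldsymbol{\pi}}\cdot\partial^\mu\hat{\boldsymbol{\pi}}, \tag{18.93} $$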

and all other terms involve derivatives of $\hat{\boldsymbol{\pi}}$. In particular, the term with
the lowest number of derivatives which contributes to the $\pi$-$\pi$ scattering
amplitude is

$$ \frac{1}{6v^2}\left[(\hat{\boldsymbol{\pi}}\cdot\partial_\mu\hat{\boldsymbol{\pi}})^2 - \hat{\boldsymbol{\pi}}^2\,(\partial_\mu\hat{\boldsymbol{\pi}}\cdot\partial^\mu\hat{\boldsymbol{\pi}})\right], \tag{18.94} $$

since the $\hat S$-$\hat\pi$-$\hat\pi$ vertex already has two derivatives. The reader may
check that the amplitude for $\pi^+\pi^0 \to \pi^+\pi^0$ calculated from (18.94) is $\mathrm{i}q^2/v^2$,
exactly as before, but this time without having to go through the cancellation
argument.

The fields in $\hat\Sigma$ on the one hand, and in $\hat S$ and $\hat U$ on the other, are related
non-linearly, but a physical amplitude calculated with either representation
has turned out to be the same, in this simple case. It is in fact generally
true that such non-linear field redefinitions lead to the same physics (Haag
1958, Coleman, Wess and Zumino 1969, Callan, Coleman, Wess and Zumino
1969). It is clearly advantageous to work with $\hat{\mathcal{L}}_S$, which builds in the desired
derivatives of the Goldstone modes.

Indeed, we can simplify matters even further. Since $\hat S$ is invariant under
SU(2)$_{\rm L}\times$SU(2)$_{\rm R}$, the full symmetry of the Lagrangian is maintained with only
the field $\hat U$, transforming by (18.90), and we may as well discard $\hat S$ altogether.
The dynamics of the Goldstone sector are then described by the non-linear
$\sigma$-model, with Lagrangian

$$ \hat{\mathcal{L}}_2 = \frac{v^2}{4}\,\mathrm{Tr}(\partial_\mu\hat U\partial^\mu\hat U^\dagger). \tag{18.95} $$

This is the most general Lagrangian that involves the Goldstone fields, exhibits
the desired symmetry, and contains only two derivatives.

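Since $\hat{\mathcal{L}}_2$ is invariant under the SU(2)$_{\rm L}\times$SU(2)$_{\rm R}$ transformations (18.75),
we can calculate the associated Noether currents (problem 18.6), obtaining

$$ \hat j^\mu_{i,{\rm L}}(\hat U) = \frac{\mathrm{i}v^2}{8}\,\mathrm{Tr}[\tau_i((\partial^\mu\hat U)\hat U^\dagger - \hat U\partial^\mu\hat U^\dagger)] = -\frac{\mathrm{i}v^2}{4}\,\mathrm{Tr}(\tau_i\hat U\partial^\mu\hat U^\dagger), \tag{18.96} $$

$$ \hat j^\mu_{i,{\rm R}}(\hat U) = \frac{\mathrm{i}v^2}{8}\,\mathrm{Tr}[\tau_i((\partial^\mu\hat U^\dagger)\hat U - \hat U^\dagger\partial^\mu\hat U)] = -\frac{\mathrm{i}v^2}{4}\,\mathrm{Tr}(\tau_i\hat U^\dagger\partial^\mu\hat U). \tag{18.97} $$

The axial ‘R $-$ L’ current is then

$$ \hat j^\mu_{i,5}(\hat U) = \frac{\mathrm{i}v^2}{4}\,\mathrm{Tr}[\tau_i(\hat U\partial^\mu\hat U^\dagger - \hat U^\dagger\partial^\mu\hat U)], \tag{18.98} $$

and the vector ‘R + L’ current is

$$ \hat j^\mu_{i}(\hat U) = -\frac{\mathrm{i}v^2}{4}\,\mathrm{Tr}[\tau_i(\hat U\partial^\mu\hat U^\dagger + \hat U^\dagger\partial^\mu\hat U)]. \tag{18.99} $$

Expanding (18.98) in powers of the pion field, we find

$$ \hat j^\mu_{i,5}(\hat U) = v\,\partial^\mu\hat\pi_i + \ldots, \tag{18.100} $$

which we may compare with (17.93). Just as in equation (17.94), (18.100)
implies that this axial current has a matrix element between the vacuum and
the one-Goldstone state:

$$ \langle 0|\hat j^\mu_{i,5}(x)|\pi_j, p\rangle = -\mathrm{i}p^\mu v\,\mathrm{e}^{-\mathrm{i}p\cdot x}\,\delta_{ij}. \tag{18.101} $$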


Now comes the pay-off: this is the same symmetry current which enters into
weak interactions, for which we already defined the vacuum-to-one-particle
matrix element in terms of the pion decay constant $f_\pi$, via equation (18.46).
Comparing (18.101) with (18.46) we identify

$$ v = f_\pi. \tag{18.102} $$

Thus, finally, the dynamics of our massless pions, to lowest order in an
expansion in powers of momenta, is given by the Lagrangian

$$ \hat{\mathcal{L}}_2 = \tfrac14 f_\pi^2\,\mathrm{Tr}(\partial_\mu\hat U\partial^\mu\hat U^\dagger). \tag{18.103} $$

It is quite remarkable that the low energy dynamics of the (massless) Goldstone
modes is completely determined in terms of one constant, measurable
in $\pi$ decay.

The Lagrangian of (18.103) is an example of an effective Lagrangian. By
this is meant, broadly, any Lagrangian which involves the presumed relevant
degrees of freedom (here the Goldstone modes), and respects desired
symmetries of the theory. Evidently it is implied that there is some ‘underlying
theory’, couched in terms of different degrees of freedom (here quarks and
gluons), from which the symmetries have been abstracted. It is important
to realize that an effective Lagrangian may or may not be renormalizable.
Whereas our starting Lagrangian $\hat{\mathcal{L}}_\sigma$ is renormalizable, $\hat{\mathcal{L}}_2$ is not: clearly the
latter contains terms with arbitrarily many pion fields, which are operators
of arbitrarily high dimension, compensated by negative powers of $f_\pi$. As it
stands, $\hat{\mathcal{L}}_2$ can only be used at tree level, as for example in the calculation
of $\pi$-$\pi$ scattering using the interaction (18.94), for which the amplitude has
an energy dependence of the form $E^2/f_\pi^2$, where $E$ is the order of magnitude
of the particles’ energy or momentum. This interaction has mass dimension 6,
and its coupling $1/f_\pi^2$ has dimension (mass)$^{-2}$, like the 4-fermion interaction
considered in section 11.8. It is therefore not renormalizable. However, the
argument of section 11.8 suggests that a loop-by-loop renormalization
programme is possible, and this was shown to be the case by Weinberg (1979a).
Each loop built from the interaction (18.94) will carry an extra two powers
of energy, to compensate the $1/f_\pi^2$ in the coupling. Thus $f_\pi$ (or perhaps this
multiplied by factors like 4 and $\pi$, if we are lucky) provides the energy scale
characteristic of a non-renormalizable theory: as we go up in energy, we need
more loops. But, at each loop order new divergences appear, which require
additional counter terms for renormalization. Thus at any given order in
$E^2/f_\pi^2$, we must ensure that our effective Lagrangian contains all the
appropriate counter terms which are allowed by the symmetry. For example, at
one-loop order for $\hat{\mathcal{L}}_2$, we need to include the 4-derivative terms

$$ \hat{\mathcal{L}}_4 = c_1\,\mathrm{Tr}(\partial_\mu\hat U\partial^\mu\hat U^\dagger\partial_\nu\hat U\partial^\nu\hat U^\dagger) + c_2\,\mathrm{Tr}(\partial_\mu\hat U\partial_\nu\hat U^\dagger\partial^\mu\hat U\partial^\nu\hat U^\dagger). \tag{18.104} $$

To perform a one-loop calculation, one uses $\hat{\mathcal{L}}_2$ at tree-level and in one-loop
diagrams, and $\hat{\mathcal{L}}_4$ at tree-level only.

Real pions, however, are not massless, nor are real quarks. We need to 
extend our effective Lagrangian to include explicit chiral symmetry breaking 
mass terms. 


18.3.2 Inclusion of explicit symmetry breaking: masses for 
pions and quarks 


Consider the term

$$ \hat{\mathcal{L}}_{m_\pi} = \tfrac14 m_\pi^2 f_\pi^2\,\mathrm{Tr}(\hat U + \hat U^\dagger). \tag{18.105} $$

This is invariant only under the restricted set of transformations with
$\boldsymbol{\alpha}_{\rm L} = \boldsymbol{\alpha}_{\rm R}$, that is transformations such that $U_{\rm L} = U_{\rm R}$, for then
$\mathrm{Tr}\,\hat U \to \mathrm{Tr}(U_{\rm R}\hat U U_{\rm R}^\dagger) = \mathrm{Tr}\,\hat U$. Such transformations form the SU(2) flavour
isospin group. The term (18.105) breaks the axial isospin group explicitly, which
would correspond to transformations with $\boldsymbol{\alpha}_{\rm L} = -\boldsymbol{\alpha}_{\rm R}$, or equivalently
$U_{\rm R} = U_{\rm L}^\dagger$, under which $\hat U \to U_{\rm L}\hat U U_{\rm L}$. Expanding (18.105) to second order in
the pion fields, we find the term

$$ \hat{\mathcal{L}}_{{\rm quad},m_\pi} = -\tfrac12 m_\pi^2\,\hat{\boldsymbol{\pi}}^2, \tag{18.106} $$

which, together with (18.93), shows that the pion field now has mass $m_\pi$.
Higher-order terms can be added, $m_\pi^2$ counting as equivalent to two derivatives.
The low energy expansion is now an expansion in both the energy $E$
and the pion mass $m_\pi$. This is called chiral perturbation theory, or ChPT for
short.

For example, to calculate $\pi$-$\pi$ scattering to order $E^2$, we use $\hat{\mathcal{L}}_2 + \hat{\mathcal{L}}_{m_\pi}$
at tree-level, expanded up to fourth power in the pion fields. The result is
to change the amplitude for $\pi^+\pi^0 \to \pi^+\pi^0$ from $\mathrm{i}(p_+' - p_+)^2/f_\pi^2$ to
$\mathrm{i}[(p_+' - p_+)^2 - m_\pi^2]/f_\pi^2$. By considering the general $\pi$-$\pi$ amplitude,
predictions can be made for low energy observables, for example
the s-wave scattering lengths $a_0$ and $a_2$ in the isospin 0 and 2 channels. The
results (first calculated by Weinberg 1966 using current algebra techniques)
are

$$ a_0 = \frac{7m_\pi}{32\pi f_\pi^2} = 0.16\,m_\pi^{-1}, \qquad a_2 = -\frac{m_\pi}{16\pi f_\pi^2} = -0.045\,m_\pi^{-1}. \tag{18.107} $$

The experimental values are $a_0 = 0.26 \pm 0.05\,m_\pi^{-1}$ and $a_2 = -0.028 \pm 0.012\,m_\pi^{-1}$,
as given by Donoghue et al. (1992). The next order in ChPT
improves upon these results.

A systematic exposition of ChPT at the one-loop level was given by Gasser
and Leutwyler (1984). Bijnens et al. (1996) carried the $\pi$-$\pi$ calculation to
two-loop order; see also Colangelo et al. (2001).
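
The lowest-order scattering lengths of (18.107) are easy to reproduce numerically. The sketch below is illustrative only: the input values $m_\pi = 139.57$ MeV and $f_\pi = 93$ MeV are assumptions (they are not quoted in this passage), and the function name is hypothetical.

```python
import math

# Assumed inputs (not given in the text), in MeV.
M_PI = 139.57   # charged-pion mass
F_PI = 93.0     # pion decay constant


def scattering_lengths(m_pi=M_PI, f_pi=F_PI):
    """Return the lowest-order s-wave lengths (a0, a2) in units of 1/m_pi."""
    x = m_pi**2 / f_pi**2            # dimensionless ratio m_pi^2 / f_pi^2
    a0 = 7.0 * x / (32.0 * math.pi)  # isospin 0
    a2 = -x / (16.0 * math.pi)       # isospin 2
    return a0, a2


a0, a2 = scattering_lengths()
print(f"a0 = {a0:.3f} / m_pi,  a2 = {a2:.3f} / m_pi")
```

With these inputs the two numbers come out close to the values 0.16 and $-0.045$ quoted in (18.107); small changes in the assumed $f_\pi$ shift them at the few per cent level.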

It is clear that there must be some relation between the masses of the u
and d quarks (in the SU(2) flavour case) and the pion mass, since the latter
must vanish in the limit $m_u = m_d = 0$. To see this connection, we consider
the quark mass term in the 2-flavour QCD Lagrangian, which is

$$ -\bar{\hat q}\,\mathbf{m}\,\hat q, \qquad \mathbf{m} = \begin{pmatrix} m_u & 0 \\ 0 & m_d \end{pmatrix}. \tag{18.108} $$

Let us now redefine the quark fields (compare (17.107) and (18.17)) by

$$ \hat q = \exp[-\mathrm{i}\boldsymbol{\tau}\cdot\hat{\boldsymbol{\pi}}\gamma_5/(2f_\pi)]\,\hat f. \tag{18.109} $$

This transformation is a perfectly good parametrization of the Goldstone fields
associated with the axial symmetry (18.17), and effectively removes them from
the new fermion fields $\hat f$. The quark mass term now becomes

$$ -\bar{\hat f}\exp[-\mathrm{i}\boldsymbol{\tau}\cdot\hat{\boldsymbol{\pi}}\gamma_5/(2f_\pi)]\,\mathbf{m}\,\exp[-\mathrm{i}\boldsymbol{\tau}\cdot\hat{\boldsymbol{\pi}}\gamma_5/(2f_\pi)]\,\hat f. \tag{18.110} $$

We now make the assumption that the axial SU(2) is spontaneously broken
in QCD, by imposing a non-zero vev on the symmetry-breaking operator $\bar{\hat f}\hat f$:

$$ \langle 0|\bar{\hat f}_i\hat f_j|0\rangle = -f_\pi^2 B\,\delta_{ij} \qquad (i,j = 1,2). \tag{18.111} $$

Expanding (18.110) up to second order in the pion fields, retaining just the
expectation value of the fermion bilinear*, we find a mass term

$$ -\tfrac12 B(m_u + m_d)\,\hat{\boldsymbol{\pi}}^2, \tag{18.112} $$

from which the relation (Gasser and Leutwyler 1982)

$$ m_\pi^2 f_\pi^2 = -(m_u + m_d)\,\langle 0|\bar{\hat f}\hat f|0\rangle \tag{18.113} $$

follows, where $\bar{\hat f}\hat f$ represents either $\bar{\hat f}_1\hat f_1$ or $\bar{\hat f}_2\hat f_2$. From (18.113) we can see
that the square of the pion mass is proportional to the average u-d quark mass
(provided of course that $B$ does not accidentally vanish), and goes to zero as
they do; $\langle 0|\bar{\hat f}\hat f|0\rangle$ is the ‘chiral condensate’ (cf figure 18.1).

*A formal justification of this step is provided by Weinberg (1996), section 19.6.

Lattice QCD (see chapter 16) can be used to test equation (18.113), since
simulations can be done for a range of quark masses, and the relation between
$m_\pi^2$ and $m_{u,d}$ can be checked. Conversely, ChPT can assist lattice QCD
calculations by guiding the extrapolation of the calculated results to quark
mass values lighter than can presently be simulated. For example, Noaki et al.
(2008) have reported the results of such a calculation, using 2 light dynamical
quark flavours, in the overlap fermion formalism (Neuberger 1998a, 1998b),
which preserves chiral symmetry at finite lattice spacing. Their pion masses
ranged from 290 MeV to 750 MeV, and they compared their results with the
predictions of ChPT at one-loop (Gasser and Leutwyler 1984) and two-loop
(Colangelo et al. 2001). They found good fits to the ChPT formulae, and
extracted quark masses (in the $\overline{\rm MS}$ scheme at the scale 2 GeV) of about 4.5
MeV; they also found $|\langle 0|\bar{\hat f}\hat f|0\rangle| = (235\ \mathrm{MeV})^3$, in the $\overline{\rm MS}$ scheme at 2 GeV
scale. Studies by this and other groups are continuing, with 3 light flavours,
lighter pion masses, and other lattice fermion formalisms.
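
As a rough consistency check, the lattice numbers just quoted can be fed into the lowest-order relation (18.113), read here as $m_\pi^2 f_\pi^2 = (m_u + m_d)\,|\langle 0|\bar f f|0\rangle|$. This is only a sketch: $f_\pi = 93$ MeV is an assumed input, the function name is hypothetical, and higher-order ChPT corrections are ignored.

```python
# Lowest-order estimate of the pion mass from the quark masses and the
# chiral condensate, using m_pi^2 * f_pi^2 = (m_u + m_d) * |<ff>|.
M_UD = 4.5                 # MeV, average light-quark mass quoted in the text
CONDENSATE = 235.0 ** 3    # MeV^3, |<0|ff|0>| quoted in the text
F_PI = 93.0                # MeV, assumed value of the pion decay constant


def pion_mass(m_u, m_d, condensate=CONDENSATE, f_pi=F_PI):
    """Return the lowest-order pion mass in MeV."""
    return ((m_u + m_d) * condensate / f_pi**2) ** 0.5


print(f"m_pi (lowest order) = {pion_mass(M_UD, M_UD):.0f} MeV")
```

The estimate lands in the neighbourhood of the physical pion mass, and doubling the quark masses raises it by a factor $\sqrt{2}$, as the linear relation between $m_\pi^2$ and $m_u + m_d$ requires.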


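18.3.3 Extension to SU(3)$_{\rm fL}\times$SU(3)$_{\rm fR}$


To the extent that the strange quark is also ‘light’ on hadronic scales, the
QCD Lagrangian has the larger symmetry of SU(3)$_{\rm fL}\times$SU(3)$_{\rm fR}$, which breaks
spontaneously so as to preserve the flavour symmetry SU(3)$_{\rm f}$, and produce
an SU(3) octet of pseudoscalar Goldstone bosons: $\pi^\pm$, $\pi^0$, $K^\pm$, $K^0$, $\bar K^0$ and $\eta_8$
(see figure 12.4). The effective Lagrangian approach to the dynamics of the
Goldstone fields can be easily extended to chiral SU(3). One simply replaces
$\hat U = \exp(\mathrm{i}\boldsymbol{\tau}\cdot\hat{\boldsymbol{\pi}}/f_\pi)$ by $\hat V = \exp(\mathrm{i}\boldsymbol{\lambda}\cdot\hat{\boldsymbol{\phi}}/f_\pi)$ where

$$ \frac{1}{\sqrt2}\sum_{a=1}^{8}\lambda_a\hat\phi_a = \begin{pmatrix} \frac{1}{\sqrt2}\hat\pi^0 + \frac{1}{\sqrt6}\hat\eta_8 & \hat\pi^+ & \hat K^+ \\ \hat\pi^- & -\frac{1}{\sqrt2}\hat\pi^0 + \frac{1}{\sqrt6}\hat\eta_8 & \hat K^0 \\ \hat K^- & \hat{\bar K}^0 & -\frac{2}{\sqrt6}\hat\eta_8 \end{pmatrix}. \tag{18.114} $$

One easily verifies that the kinetic terms in

$$ \hat{\mathcal{L}}_2 = \tfrac14 f_\pi^2\,\mathrm{Tr}(\partial_\mu\hat V\partial^\mu\hat V^\dagger) \tag{18.115} $$

have the correct normalization, using $\mathrm{Tr}(\lambda_a\lambda_b) = 2\delta_{ab}$. The 3-flavour quark
mass term is now

$$ -\bar{\hat f}\exp[-\mathrm{i}\boldsymbol{\lambda}\cdot\hat{\boldsymbol{\phi}}\gamma_5/(2f_\pi)]\,\mathbf{m}_3\,\exp[-\mathrm{i}\boldsymbol{\lambda}\cdot\hat{\boldsymbol{\phi}}\gamma_5/(2f_\pi)]\,\hat f \tag{18.116} $$

where

$$ \mathbf{m}_3 = \begin{pmatrix} m_u & 0 & 0 \\ 0 & m_d & 0 \\ 0 & 0 & m_s \end{pmatrix}. \tag{18.117} $$

The axial SU(3) symmetry breaking vev is

$$ \langle 0|\bar{\hat f}_i\hat f_j|0\rangle = -f_\pi^2 B\,\delta_{ij} \qquad (i,j = 1,2,3) \tag{18.118} $$

and the meson mass term is

$$ -\tfrac12 B\,\mathrm{Tr}[(\boldsymbol{\lambda}\cdot\hat{\boldsymbol{\phi}})^2\,\mathbf{m}_3]. \tag{18.119} $$

This yields (problem 18.7)

$$ m^2_{\pi^\pm} = m^2_{\pi^0} = B(m_u + m_d), \tag{18.120} $$
$$ m^2_{K^\pm} = B(m_u + m_s), \tag{18.121} $$
$$ m^2_{K^0} = B(m_d + m_s), \tag{18.122} $$
$$ m^2_{\eta_8} = \tfrac13 B(m_u + m_d + 4m_s), \tag{18.123} $$

and there is also a term which mixes $\pi^0$ and $\eta_8$:

$$ -\frac{B}{\sqrt3}(m_u - m_d)\,\hat\pi^0\hat\eta_8. \tag{18.124} $$

It is interesting that the charged and neutral pions have the same mass, even
though we have made no assumption about the ratio of $m_u$ to $m_d$. The
observed pion mass differences arise from electromagnetism.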
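
The mass formulae of problem 18.7 can be checked numerically. The sketch below assumes that the quadratic mass term has the form $-\tfrac12 B\,\mathrm{Tr}[(\boldsymbol{\lambda}\cdot\hat{\boldsymbol{\phi}})^2\mathbf{m}_3]$, so that the squared mass of the real component $\hat\phi_a$ is $B\,\mathrm{Tr}(\lambda_a^2\mathbf{m}_3)$ and the coefficient of $-\hat\phi_3\hat\phi_8$ is $B\,\mathrm{Tr}(\lambda_3\lambda_8\mathbf{m}_3)$. The helper names and the quark-mass values are illustrative only.

```python
import math


def gell_mann():
    """Return the eight Gell-Mann matrices as nested lists (index 0..7)."""
    s = 1 / math.sqrt(3)
    return [
        [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
        [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]],
        [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
        [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
        [[0, 0, -1j], [0, 0, 0], [1j, 0, 0]],
        [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
        [[0, 0, 0], [0, 0, -1j], [0, 1j, 0]],
        [[s, 0, 0], [0, s, 0], [0, 0, -2 * s]],
    ]


def matmul(a, b):
    """Product of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def trace(a):
    return sum(a[i][i] for i in range(3))


def mass_coefficient(a, b, quark_masses, B=1.0):
    """Return B * Tr(lambda_a lambda_b m_3) for 0-based indices a, b."""
    lam = gell_mann()
    m3 = [[quark_masses[i] if i == j else 0.0 for j in range(3)]
          for i in range(3)]
    return (B * trace(matmul(matmul(lam[a], lam[b]), m3))).real


masses = (2.0, 5.0, 100.0)  # illustrative (m_u, m_d, m_s), arbitrary units
print("pion  :", mass_coefficient(0, 0, masses), mass_coefficient(2, 2, masses))
print("K+-   :", mass_coefficient(3, 3, masses))
print("K0    :", mass_coefficient(5, 5, masses))
print("eta_8 :", mass_coefficient(7, 7, masses))
print("mixing:", mass_coefficient(2, 7, masses))
```

The first line shows the charged ($\hat\phi_1$) and neutral ($\hat\phi_3$) pion entries agreeing for unequal $m_u$ and $m_d$, while the last line is non-zero only because $m_u \neq m_d$.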

If we ignore for the moment electromagnetic mass differences, we can
deduce from (18.120)-(18.122) the relation

$$ \frac{m_\pi^2}{2m_K^2 - m_\pi^2} = \frac{m_u + m_d}{2m_s}. \tag{18.125} $$

The left-hand side is approximately equal to 0.04, so we learn that the
non-strange quarks are about 1/25 times as heavy as the strange quark. We also
obtain

$$ m^2_{\eta_8} = \tfrac13(4m_K^2 - m_\pi^2), \tag{18.126} $$

which is the Gell-Mann-Okubo formula for the (squared) masses of the
pseudoscalar meson octet (Gell-Mann 1961, Okubo 1962). Using average values
for the K and $\pi$ masses, the relation (18.126) predicts $m_{\eta_8} = 566$ MeV, quite
close to the $\eta$ (548 MeV).
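
Both numbers quoted above follow from simple arithmetic. The sketch below uses assumed isospin-averaged meson masses (they are not given in this passage), and the function names are illustrative.

```python
import math

# Assumed isospin-averaged meson masses in MeV (not quoted in the text).
M_PI = 137.3
M_K = 495.6


def eta8_mass(m_k=M_K, m_pi=M_PI):
    """Gell-Mann-Okubo prediction: m_eta8^2 = (4 m_K^2 - m_pi^2) / 3."""
    return math.sqrt((4 * m_k**2 - m_pi**2) / 3)


def light_to_strange_ratio(m_k=M_K, m_pi=M_PI):
    """Meson-mass ratio m_pi^2 / (2 m_K^2 - m_pi^2) appearing in (18.125)."""
    return m_pi**2 / (2 * m_k**2 - m_pi**2)


print(f"m_eta8 = {eta8_mass():.0f} MeV")
print(f"ratio  = {light_to_strange_ratio():.3f}")
```

The first result is within a few per cent of the physical $\eta$ mass, and the second is close to 1/25.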

Further progress requires the inclusion of electromagnetic effects, since $m_u$
and $m_d$ are themselves comparable to the electromagnetic mass differences.
Including these effects, Weinberg (1996) estimates

$$ \frac{m_d}{m_s} \approx 0.050, \qquad \frac{m_u}{m_s} \approx 0.027; \tag{18.127} $$

see also Leutwyler (1996). Note that the d quark is almost twice as heavy as
the u quark: according to QCD, the origin of SU(2) isospin symmetry is not
that $m_u \approx m_d$, but that both are very small compared with, say, $\Lambda_{\overline{\rm MS}}$.

All the results we have given are subject to correction by the inclusion
of higher-order effects in the ChPT expansion. In the case of chiral SU(3),
the fourth-order Lagrangian $\hat{\mathcal{L}}_4$ contains 8 terms (Gasser and Leutwyler 1984,
1985). Donoghue et al. (1992) give a clear exposition of ChPT to one-loop
order.

18.4 Chiral anomalies


In all our discussions of symmetries so far — unbroken, approximate, and 
spontaneously broken — there is one result on which we have relied, and never 
queried. We refer to Noether’s theorem, as discussed in section 12.3.1. This 
states that for every continuous symmetry of a Lagrangian, there is a cor- 
responding conserved current. We demonstrated this result in some special 
cases, but we have now to point out that while it is undoubtedly valid at 
the level of the classical Lagrangian and field equations, we did not inves- 
tigate whether quantum corrections might violate the classical conservation 
law. This can, in fact, happen and when it does the afflicted current (or its 
divergence) is said to be ‘anomalous’, or to contain an ‘anomaly’. General 
analysis shows that anomalies occur in renormalizable theories of fermions 
coupled to both vector and axial vector currents. We may therefore expect to 
find anomalies among the vector and axial vector flavour currents which we 
have been discussing. 

One way of understanding how anomalies arise is through consideration of 
the renormalization process, which is in general necessary once we get beyond 
the classical (‘tree level’) approximation. As we saw in volume 1, this will 
invariably entail some regularization of divergent integrals. But the specific 
example of the $O(e^2)$ photon self-energy studied in section 11.3 showed that
a simple cut-off form of regularization already violated the current conserva- 
tion (or gauge invariance) condition (11.21). In that case, it was possible to 
find alternative regularizations which respected electromagnetic current con- 
servation, and were satisfactory. Anomalies arise when both axial and vector 
symmetry currents are present, since it is not possible to find a regulariza- 
tion scheme which preserves both vector and axial vector current conservation 
(Adler 1970, Jackiw 1972, Adler and Bardeen 1969). 

We shall not attempt an extended discussion of this technical subject. But 
we do want to alert the reader to the existence of these anomalies; to indicate 
how they arise in one simple model; and to explain why, in some cases, they 
are to be welcomed, while in others they must be eliminated. 

We consider the classic case of $\pi^0 \to \gamma\gamma$, in the context of spontaneously
broken chiral flavour symmetry, with massless quarks and pions. The axial
isospin current $\hat j^\mu_{i,5}(x)$ should then be conserved, but we shall see that this
implies that the amplitude for $\pi^0 \to \gamma\gamma$ must vanish, as first pointed out
by Veltman (1967) and Sutherland (1967). We begin by writing the matrix
element of $\hat j^\mu_{3,5}(x)$ between the vacuum and a $2\gamma$ state, in momentum space,
as

$$ \int \mathrm{d}^4x\,\mathrm{e}^{-\mathrm{i}q\cdot x}\,\langle\gamma,k_1,\epsilon_1;\gamma,k_2,\epsilon_2|\hat j^\mu_{3,5}(x)|0\rangle = (2\pi)^4\delta^4(k_1 + k_2 - q)\,\epsilon^*_{1\nu}(k_1)\epsilon^*_{2\lambda}(k_2)\,\mathcal{M}^{\mu\nu\lambda}(k_1,k_2). \tag{18.128} $$

FIGURE 18.4
The amplitude considered in (18.128), and the one pion intermediate state
contribution to it.

As in figure 18.3, one contribution to $\mathcal{M}^{\mu\nu\lambda}$ has the form (constant$/q^2$) due to
the massless $\pi^0$ propagator, shown in figure 18.4. This is, once again, because
when chiral symmetry is spontaneously broken, the axial current connects the
pion state to the vacuum, as described by the matrix element (18.46). The
contribution of the process shown in figure 18.4 to $\mathcal{M}^{\mu\nu\lambda}$ is then

$$ \mathrm{i}q^\mu f_\pi\,\frac{\mathrm{i}}{q^2}\,\mathrm{i}A\,\epsilon^{\nu\lambda\alpha\beta}k_{1\alpha}k_{2\beta}, \tag{18.129} $$

where the $\pi^0 \to \gamma\gamma$ amplitude is $A\,\epsilon^{\nu\lambda\alpha\beta}\epsilon^*_{1\nu}(k_1)\epsilon^*_{2\lambda}(k_2)k_{1\alpha}k_{2\beta}$. Note that this
automatically incorporates electromagnetic gauge invariance (the amplitude
vanishes when the polarization vector of either photon is replaced by its
4-momentum, due to the antisymmetry of the $\epsilon$ symbol), and it is symmetrical
under interchange of the photon labels. Now consider replacing $\hat j^\mu_{3,5}(x)$ in
(18.128) by $\partial_\mu\hat j^\mu_{3,5}(x)$, which should be zero. A partial integration then shows
that this implies that

$$ q_\mu\mathcal{M}^{\mu\nu\lambda} = 0, \tag{18.130} $$

which with (18.129) implies that $A = 0$, and hence that $\pi^0 \to \gamma\gamma$ is forbidden.
It is important to realize that all other contributions to $\mathcal{M}^{\mu\nu\lambda}$, apart from
the $\pi^0$ one shown in figure 18.4, will not have the $1/q^2$ factor in (18.129), and
will therefore give a vanishing contribution to $q_\mu\mathcal{M}^{\mu\nu\lambda}$ at $q^2 = 0$ which is the
on-shell point for the (massless) pion.

It is of course true that $m_\pi^2 \neq 0$. But estimates (Adler 1969) of the
consequent corrections suggest that the predicted rate of $\pi^0 \to \gamma\gamma$ for real $\pi^0$’s is
far too small. Consequently, there is a problem for the hypothesis of
spontaneously broken (approximate) chiral symmetry.

FIGURE 18.5
The two $O(\alpha)$ graphs contributing to $\pi^0 \to \gamma\gamma$ decay.

In such a situation it is helpful to consider a detailed calculation performed
within a specific model. This is supplied by Itzykson and Zuber (1980),
section 11.5.2; in essentials it is the same as the one originally considered by
Steinberger (1949) in the first calculation of the $\pi^0 \to \gamma\gamma$ rate, and
subsequently by Bell and Jackiw (1969) and by Adler (1969). It employs (scalar)
$\sigma$ and (pseudoscalar) $\pi^0$ meson fields, augmented by a fermion of mass $m$
and charge $+e$, representing the proton. To order $\alpha$, there are two graphs to
consider, shown in figure 18.5(a) and (b). It turns out that the fermion loop
integral is actually convergent. In the limit $q^2 \to 0$ the result is

$$ A = \frac{e^2}{4\pi^2 f_\pi}, \tag{18.131} $$

where $A$ is the $\pi^0 \to \gamma\gamma$ amplitude introduced above. Problem 18.8 evaluates
the $\pi^0 \to \gamma\gamma$ rate using (18.131), to give

$$ \Gamma(\pi^0 \to 2\gamma) = \frac{\alpha^2}{64\pi^3}\,\frac{m_\pi^3}{f_\pi^2}. \tag{18.132} $$

(18.132) is in very good agreement with experiment.
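
A numerical sketch of the estimate asked for in problem 18.8 follows. It assumes the standard two-photon result $\Gamma = |A|^2 m_\pi^3/(64\pi)$ with $A = e^2/(4\pi^2 f_\pi)$, i.e. $\Gamma = \alpha^2 m_\pi^3/(64\pi^3 f_\pi^2)$; the input values ($m_{\pi^0}$, $f_\pi$, $\hbar$) are assumptions not quoted in this passage, and the function name is illustrative.

```python
import math

ALPHA = 1 / 137.036   # fine-structure constant
M_PI0 = 134.98        # MeV, neutral pion mass (assumed input)
F_PI = 93.0           # MeV, pion decay constant (assumed input)
HBAR = 6.582e-22      # MeV s


def pi0_width(m_pi=M_PI0, f_pi=F_PI, alpha=ALPHA):
    """Return Gamma(pi0 -> 2 gamma) = alpha^2 m^3 / (64 pi^3 f^2) in MeV."""
    return alpha**2 * m_pi**3 / (64 * math.pi**3 * f_pi**2)


width = pi0_width()
print(f"width    = {width * 1e6:.2f} eV")
print(f"lifetime = {HBAR / width:.2e} s")
```

The width comes out at a few eV and the corresponding lifetime at somewhat below $10^{-16}$ s; the result scales as $1/f_\pi^2$, so it is sensitive to the assumed decay constant.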

In principle, various possibilities now exist. But a careful analysis of the
‘triangle’ graph contributions to the matrix element $\mathcal{M}^{\mu\nu\lambda}$ of (18.128), shown
in figure 18.6, reveals that the fault lies in assuming that a regularization exists
such that for these amplitudes the conservation equation $q_\mu\langle\gamma\gamma|\hat j^\mu_{3,5}(0)|0\rangle = 0$
can be maintained, at the same time as electromagnetic gauge invariance. In
fact, no such regularization can be found. When the amplitudes of figure
18.6 are calculated using an (electromagnetic) gauge invariant procedure, one
finds a non-zero result for $q_\mu\langle\gamma\gamma|\hat j^\mu_{3,5}(0)|0\rangle$ (again the details are given in
Itzykson and Zuber (1980)). This implies that $\partial_\mu\hat j^\mu_{3,5}(x)$ is not zero after all, the
calculation producing the specific value

$$ \partial_\mu\hat j^\mu_{3,5}(x) = -\frac{e^2}{32\pi^2}\,\epsilon^{\alpha\nu\beta\lambda}\hat F_{\alpha\nu}\hat F_{\beta\lambda}, \tag{18.133} $$

where the $\hat F$’s are the usual electromagnetic field strengths.
Equation (18.133) means that (18.130) is no longer valid, so that $A$ need
no longer vanish: indeed, (18.133) predicts a definite value for $A$, so we need to
see if it is consistent with (18.131). Taking the vacuum $\to 2\gamma$ matrix element
of (18.133) produces (problem 18.9)

$$ \mathrm{i}q_\mu\mathcal{M}^{\mu\nu\lambda} = \frac{e^2}{4\pi^2}\,\epsilon^{\alpha\nu\beta\lambda}k_{1\alpha}k_{2\beta}, \tag{18.134} $$

which is indeed consistent with (18.128) and (18.131), after suitably
interchanging the labels on the $\epsilon$ symbol.

FIGURE 18.6
$O(\alpha)$ contributions to the matrix element in (18.128).

Equation (18.133) is therefore a typical example of ‘an anomaly’: the
violation, at the quantum level, of a symmetry of the classical Lagrangian. It
might be thought that the result (18.133) is only valid to order $\alpha$ (though the
$O(\alpha^2)$ correction would presumably be very small). But Adler and Bardeen
(1969) showed that such ‘triangle’ loops give the only anomalous contributions
to the $\hat j^\mu_{i,5}$-$\gamma$-$\gamma$ vertex, so that (18.133) is true to all orders in $\alpha$.

The triangles considered above actually used a fermion with integer charge
(the proton). We clearly should use quarks, which carry fractional charge.
In this case, the previous numerical value for $A$ is multiplied by the factor
$\tau_3 Q^2$ for each contributing quark. For the u and d quarks of chiral SU(2) $\times$
SU(2), this gives 1/3. Consequently agreement with experiment is lost unless
there exist three replicas of each quark, identical in their electromagnetic and
SU(2) $\times$ SU(2) properties. Colour supplies just this degeneracy, and thus the
$\pi^0 \to \gamma\gamma$ rate is important evidence for such a degree of freedom, as we noted
in chapter 14.
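
The counting behind this colour argument is short enough to spell out. The following sketch sums $\tau_3 Q^2$ over the u and d quarks, multiplied by the number of colours; the function name and data layout are illustrative, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def anomaly_factor(n_colours):
    """Return N_c * sum over (u, d) of tau_3 * Q^2."""
    # (tau_3 eigenvalue, electric charge in units of e)
    quarks = {"u": (1, Fraction(2, 3)), "d": (-1, Fraction(-1, 3))}
    return n_colours * sum(t3 * q**2 for t3, q in quarks.values())


for n_c in (1, 3):
    factor = anomaly_factor(n_c)
    # The amplitude A scales with the factor, the decay rate with its square.
    print(n_c, factor, factor**2)
```

Without colour the amplitude is reduced by a factor 1/3 and the rate by 1/9 relative to the integer-charge calculation; with three colours the original value is recovered.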

In the foregoing discussion, the axial isospin current was associated with a
global symmetry; only the electromagnetic currents (in the case of $\pi^0 \to \gamma\gamma$)
were associated with a local (gauged) symmetry, and they remained conserved
(anomaly free). If, however, we have an anomaly in a current associated with
a local symmetry, we will have a serious problem. The whole rather elaborate
construction of a quantum gauge field theory relies on current conservation
equations such as (11.21) or (13.130) to eliminate unwanted gauge degrees
of freedom, and ensure unitarity of the S-matrix. So anomalies in currents
coupled to gauge fields cannot be tolerated. As we shall see in chapter 20,
and as is already evident from (18.48), axial currents are indeed present in weak
interactions and they are coupled to the W$^\pm$, Z$^0$ gauge fields. Hence, if this
theory is to be satisfactory at the quantum level, all anomalies must somehow
cancel away. That this is possible rests essentially on the observation that the
anomaly (18.133) is independent of the mass of the circulating fermion. Thus
cancellations are in principle possible between quark and lepton ‘triangles’
in the weak interaction case. Bouchiat et al. (1972) were the first to point
out that, for each generation of quarks and leptons, the anomalies will cancel
between quarks and leptons if the fractionally charged quarks come in three
colours. The condition that anomalies cancel in the gauged currents of the
Standard Model is the remarkably simple one (Ryder 1996, p384):

$$ N_c(Q_u + Q_d) + Q_e = 0, \tag{18.135} $$

where $N_c$ is the number of colours and $Q_u$, $Q_d$ and $Q_e$ are the charges (in units
of $e$) of the ‘u’, ‘d’, and ‘e’ type fields in each generation. Clearly (18.135)
is true for each generation of the Standard Model; the condition indicates
a remarkable connection, at some deep level, between the facts that quarks
come in three colours and have charges which are 1/3 fractions. The Standard
Model provides no explanation for this connection. Anomaly cancellation is a
powerful constraint on possible theories (’t Hooft 1980, Weinberg 1996 section
22.4).
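
Condition (18.135) can be evaluated directly for the charges of one Standard Model generation. This is a minimal sketch with an illustrative function name; the loop over hypothetical colour numbers is there only to show that three is singled out.

```python
from fractions import Fraction


def anomaly_sum(n_c, q_u, q_d, q_e):
    """Return N_c * (Q_u + Q_d) + Q_e, the left-hand side of (18.135)."""
    return n_c * (q_u + q_d) + q_e


# Charges in units of e for the 'u', 'd' and 'e' type fields.
q_u, q_d, q_e = Fraction(2, 3), Fraction(-1, 3), Fraction(-1)

for n_c in (1, 2, 3, 4):
    print(n_c, anomaly_sum(n_c, q_u, q_d, q_e))
```

Only the case of three colours gives zero for these charges.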


Problems 

18.1 Verify (18.24)-(18.26). 

18.2 Verify (18.28)-(18.30). 

18.3 Show that £, of (18.12) can be written as (18.39). 


18.4 Show that the rate for $\pi^+ \to \mu^+\nu_\mu$, calculated from the lowest-order
matrix element (18.52), is given by (18.53).


18.5 Verify the transformation equations (18.76) and (18.77). 
18.6

(a) Consider a Lagrangian $\hat{\mathcal{L}}(\hat\phi_r, \partial_\mu\hat\phi_r)$ where the $\hat\phi_r$ could be either bosonic
or fermionic fields. Let the fields transform by an infinitesimal local
($x$-dependent) transformation

$$ \hat\phi_r \to \hat\phi_r - \mathrm{i}\epsilon_a(x)\,T^a_{rs}\hat\phi_s \quad (\text{sum on } s). $$

Show that the change in $\hat{\mathcal{L}}$ may be written as

$$ \delta\hat{\mathcal{L}} = \hat j^\mu_a(x)\,\partial_\mu\epsilon_a(x) + \epsilon_a(x)\,\partial_\mu\hat j^\mu_a(x), $$

where

$$ \hat j^\mu_a(x) = -\mathrm{i}T^a_{rs}\hat\phi_s\,\frac{\partial\hat{\mathcal{L}}}{\partial(\partial_\mu\hat\phi_r)} = \frac{\partial(\delta\hat{\mathcal{L}})}{\partial(\partial_\mu\epsilon_a(x))} $$

and

$$ \partial_\mu\hat j^\mu_a(x) = \frac{\partial(\delta\hat{\mathcal{L}})}{\partial\epsilon_a(x)}. \tag{1} $$

Deduce that if $\hat{\mathcal{L}}$ is invariant under the global form of this transformation
(i.e. constant $\epsilon_a$), then the current defined by (1) is conserved.
[This procedure for finding conserved currents for global symmetries
is due to Gell-Mann and Levy (1960).]

(b) Apply the method of part (a) to verify the form of the currents (18.96)
and (18.97).


18.7 Verify equations (18.120)-(18.124). 
18.8 Verify (18.132), and calculate the $\pi^0$ lifetime in seconds.
18.9 Verify (18.134).

Spontaneously Broken Local Symmetry


In earlier parts of this book we have briefly indicated why we might want to 
search for a gauge theory of the weak interactions. The reasons include: (i) 
the goal of unification (e.g. with the U(1) gauge theory QED), as mentioned in 
section 1.3.5; and (ii) certain ‘universality’ phenomena (to be discussed more 
fully in chapter 20), which are reminiscent of a similar situation in QED (see 
comment (ii) in section 2.6, and also section 11.6), and which are particularly 
characteristic of a non-Abelian gauge theory, as pointed out in section 13.1 
after equation (13.44). However, we also know from section 1.3.5 that weak 
interactions are short-ranged, so that their mediating quanta must be massive. 
At first sight, this seems to rule out the possibility of a gauge theory of weak 
interactions, since a simple gauge boson mass violates gauge invariance, as we 
pointed out for the photon in section 11.3 and for non-Abelian gauge quanta 
in section 13.3.1, and will review again in the following section. Nevertheless, 
there is a way of giving gauge field quanta a mass, which is by ‘spontaneously 
breaking’ the gauge (i.e. local) symmetry. This is the topic of the present 
chapter. The detailed application to the electroweak theory will be made in 
chapter 21. 




19.1 Massive and massless vector particles 


Let us begin by noting an elementary (classical) argument for why a gauge
field quantum cannot have mass. The electromagnetic potential satisfies the
Maxwell equation (cf (2.22))

$$ \Box A^\nu - \partial^\nu(\partial_\mu A^\mu) = j^\nu_{\rm em}, \tag{19.1} $$

which, as discussed in section 2.3, is invariant under the gauge transformation

$$ A^\mu \to A'^\mu = A^\mu - \partial^\mu\chi. \tag{19.2} $$

However, if $A^\mu$ were to represent a massive field, the relevant wave equation
would be

$$ (\Box + M^2)A^\nu - \partial^\nu(\partial_\mu A^\mu) = j^\nu_{\rm em}. \tag{19.3} $$

To get this, we have simply replaced the massless ‘Klein-Gordon’ operator $\Box$
by the corresponding massive one, $\Box + M^2$ (compare sections 3.1 and 5.3).
Equation (19.3) is manifestly not invariant under (19.2), and it is precisely
the mass term $M^2A^\nu$ that breaks the gauge invariance. The same conclusion
follows in a Lagrangian treatment; to obtain (19.3) as the corresponding
Euler-Lagrange equation, one adds a mass term $+\tfrac12 M^2A_\mu A^\mu$ to the Lagrangian of
(7.66) (see also sections 11.4 and 13.3.1), and this clearly violates invariance
under (19.2). Similar reasoning holds for the non-Abelian case too. Perhaps,
then, we must settle for a theory involving massive charged vector bosons,
W$^\pm$ for example, without it being a gauge theory.

FIGURE 19.1
Fermion-fermion scattering via exchange of two X bosons.

Such a theory is certainly possible, but it will not be renormalizable, as
we now discuss. Consider figure 19.1, which shows some kind of
fermion-fermion scattering (we need not be more specific), proceeding in fourth order of
perturbation theory via the exchange of two massive vector bosons, which we
will call X-particles. To calculate this amplitude, we need the propagator for
the X-particle, which can be found by following the ‘heuristic’ route outlined in
section 7.3.2 for photons. We consider the momentum-space version of (19.3)
for the corresponding $X^\nu$ field, but without the current on the right-hand side
(so as to describe a free field):

$$ [(-k^2 + M^2)g^{\nu\mu} + k^\nu k^\mu]X_\mu(k) = 0, \tag{19.4} $$

which should be compared with (7.90). Apart from the ‘$\mathrm{i}\epsilon$’, the propagator
should be proportional to the inverse of the quantity in the square brackets
in (19.4). Problem 19.1 shows that unlike the (massless) photon case, this
inverse does exist, and is given by

$$ \frac{-g^{\mu\nu} + k^\mu k^\nu/M^2}{k^2 - M^2}. \tag{19.5} $$

A proper field-theoretic derivation would yield this result multiplied by an
overall factor ‘i’ as usual, and would also include the ‘$\mathrm{i}\epsilon$’ via $k^2 - M^2 \to
k^2 - M^2 + \mathrm{i}\epsilon$. We remark immediately that (19.5) gives nonsense in the limit
$M \to 0$, thus indicating already that a massless vector particle seems to be a
very different kind of thing from a massive one (we can’t just take the massless
limit of the latter).
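
The statement that (19.5) inverts the bracket in (19.4) can be checked numerically for any off-shell momentum. In the sketch below both objects are stored with upper indices and contracted through the metric $g = \mathrm{diag}(1,-1,-1,-1)$, so the product should reproduce $g^{\nu\sigma}$; the function names and the sample momentum are illustrative.

```python
# Metric tensor g = diag(1, -1, -1, -1); numerically g^{mu nu} = g_{mu nu}.
G = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]


def k_squared(k):
    return sum(G[a][a] * k[a] ** 2 for a in range(4))


def wave_operator(k, M):
    """Bracket of (19.4): (-k^2 + M^2) g^{nu mu} + k^nu k^mu."""
    k2 = k_squared(k)
    return [[(-k2 + M**2) * G[n][m] + k[n] * k[m] for m in range(4)]
            for n in range(4)]


def propagator(k, M):
    """Expression (19.5): (-g^{mu nu} + k^mu k^nu / M^2) / (k^2 - M^2)."""
    k2 = k_squared(k)
    return [[(-G[m][n] + k[m] * k[n] / M**2) / (k2 - M**2)
             for n in range(4)] for m in range(4)]


def contract(a, b):
    """Return a^{nu mu} g_{mu rho} b^{rho sigma} for a diagonal metric."""
    return [[sum(a[n][m] * G[m][m] * b[m][s] for m in range(4))
             for s in range(4)] for n in range(4)]


k, M = (1.7, 0.3, -0.4, 1.1), 0.9   # arbitrary off-shell momentum and mass
for row in contract(wave_operator(k, M), propagator(k, M)):
    print([round(x, 12) for x in row])
```

The printed matrix is the metric itself. The $k^\mu k^\nu/M^2$ term is essential for this to work, which is another way of seeing why the $M \to 0$ limit of (19.5) does not exist.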

Now consider the loop integral in figure 19.1. At each vertex we will have
a coupling constant $g$, associated with an interaction Lagrangian having the
general form $g\bar{\hat\psi}\gamma_\mu\hat\psi\hat X^\mu$ (a $\gamma_\mu\gamma_5$ coupling could also be present but will not
affect the argument). Just as in QED, this ‘$g$’ is dimensionless but, as we
warned the reader in section 11.8, this may not guarantee renormalizability,
and indeed this is a case where it does not. To get an idea of why this might
be so, consider the leading divergent behaviour of figure 19.1. This will be
associated with the $k^\mu k^\nu$ terms in the numerator of (19.5), so that the leading
divergence is effectively

$$ \sim \int \mathrm{d}^4k\left(\frac{k^\mu k^\nu}{k^2}\right)\left(\frac{k^\rho k^\sigma}{k^2}\right)\frac{1}{\not{k}}\,\frac{1}{\not{k}} \tag{19.6} $$

for high $k$-values (we are not troubling to get all the indices right, we are
omitting the spinors altogether, and we are looking only at the large $k$ part
of the propagators). Now the first two bracketed terms in (19.6) behave like
a constant at large $k$, so that the divergence is effectively

$$ \sim \int \frac{\mathrm{d}^4k}{\not{k}\,\not{k}}, \tag{19.7} $$

which is quadratically divergent, and indeed exactly what we would get in a
‘four-fermion’ theory (see (11.98) for example). This strongly suggests that
the theory is non-renormalizable.

Where have these dangerous powers of k in the numerator of (19.6) come 
from? The answer is simple and important. They come from the longitudinal 
polarization state of the massive X-particle, as we shall now explain. The 
free-particle wave equation is 


(0 + M2)X” — 9(9,X*) =0 (19.8) 


and plane wave solutions have the form 
Xe eae, (19.9) 
Hence the polarization vectors e” satisfy the condition 
(—k? + Me" + k"k, e! = 0. (19.10) 
Taking the ‘dot’ product of (19.10) with k, leads to 
M?*k-e=0, (19.11) 


which implies (for M? # 0!) 
k-e=0. (19.12) 
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Equation (19.12) is a covariant condition, which has the effect of ensuring 
that there are just three independent polarization vectors, as we expect for a 
spin-1 particle. Let us take k” = (49,0,0,|k]); then the z- and y-directions 
are ‘transverse’ while the z-direction is ‘longitudinal’. Now, in the rest frame 
of the X, such that krest = (M,0,0,0), (19.12) reduces to e = 0, and we may 
choose three independent e’s as 


e! (krest, A) = (0, €(A)) (19.13) 

with 
e(A = +1) = +2 (1, +i,0) (19.14) 
e(\=0) = (0,0,1). (19.15) 


The e’s are ‘orthonormalized’ so that (cf (7.86)) 
e(A)* 8 e(A) = Oxy. (19.16) 


These states have definite spin projection ($\lambda = \pm 1, 0$) along the z-axis. For
the result in a general frame, we can Lorentz transform $\epsilon^\mu(k_{\rm rest}, \lambda)$ as required.
For example, in a frame such that $k^\mu = (k^0, 0, 0, |\mathbf{k}|)$, we find
$$\epsilon^\mu(k, \lambda = \pm 1) = \epsilon^\mu(k_{\rm rest}, \lambda = \pm 1) \qquad (19.17)$$
as before, but the longitudinal polarization vector becomes (problem 19.2)
$$\epsilon^\mu(k, \lambda = 0) = M^{-1}(|\mathbf{k}|, 0, 0, k^0). \qquad (19.18)$$

Note that $k \cdot \epsilon(k, \lambda = 0) = 0$ as required.
From (19.17) and (19.18) it is straightforward to verify the result (problem
19.3)
$$\sum_{\lambda = 0, \pm 1} \epsilon^\mu(k, \lambda)\, \epsilon^{\nu *}(k, \lambda) = -g^{\mu\nu} + k^\mu k^\nu / M^2. \qquad (19.19)$$
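The spin sum (19.19) and the condition (19.12) can be checked numerically. The following is a minimal sketch, assuming the metric diag(+1, -1, -1, -1) and the polarization vectors (19.13)-(19.18); the function names are illustrative only and do not come from the text.

```python
import math

METRIC = (1.0, -1.0, -1.0, -1.0)  # assumed signature diag(+,-,-,-)


def minkowski_dot(a, b):
    """Minkowski product a.b of two 4-vectors (no complex conjugation)."""
    return sum(g * x * y for g, x, y in zip(METRIC, a, b))


def polarization_vectors(M, kz):
    """Return k = (k0, 0, 0, kz) on the mass shell and the three vectors eps(k, lambda)."""
    k0 = math.sqrt(kz * kz + M * M)
    k = (k0, 0.0, 0.0, kz)
    s = 1.0 / math.sqrt(2.0)
    eps = {
        +1: (0.0, -s, -1j * s, 0.0),      # -(1, +i, 0)/sqrt(2), from (19.14)
        -1: (0.0, +s, -1j * s, 0.0),      # +(1, -i, 0)/sqrt(2), from (19.14)
        0: (kz / M, 0.0, 0.0, k0 / M),    # longitudinal state (19.18)
    }
    return k, eps


def spin_sum(eps):
    """Left-hand side of (19.19): sum over lambda of eps^mu (eps^nu)*."""
    return [[sum(e[m] * complex(e[n]).conjugate() for e in eps.values())
             for n in range(4)] for m in range(4)]


def expected_spin_sum(k, M):
    """Right-hand side of (19.19): -g^{mu nu} + k^mu k^nu / M^2."""
    return [[-(METRIC[m] if m == n else 0.0) + k[m] * k[n] / M ** 2
             for n in range(4)] for m in range(4)]


k, eps = polarization_vectors(M=2.0, kz=3.0)
lhs, rhs = spin_sum(eps), expected_spin_sum(k, 2.0)
max_diff = max(abs(lhs[m][n] - rhs[m][n]) for m in range(4) for n in range(4))
print("largest deviation from (19.19):", max_diff)
print("k.eps for each lambda:", {lam: abs(minkowski_dot(k, e)) for lam, e in eps.items()})
```

Increasing `kz` at fixed `M` shows the components of the longitudinal vector growing like $k^\mu/M$, which is the behaviour responsible for the $k^\mu k^\nu/M^2$ term.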


Consider now the propagator for a spin-1/2 particle, given in (7.63):
$$\frac{i(\not{k} + m)}{k^2 - m^2 + i\epsilon}. \qquad (19.20)$$

Equation (7.64) shows that the factor in the numerator of (19.20) arises from
the spin sum
$$\sum_s u_\alpha(k, s)\,\bar{u}_\beta(k, s) = (\not{k} + m)_{\alpha\beta}. \qquad (19.21)$$

In just the same way, the massive spin-1 propagator is given by
$$\frac{i[-g^{\mu\nu} + k^\mu k^\nu / M^2]}{k^2 - M^2 + i\epsilon}, \qquad (19.22)$$
the numerator in (19.22) arising from the spin sum (19.19). Thus the dangerous
factor $k^\mu k^\nu/M^2$ can be traced to the spin sum (19.19): in particular, at
large values of k the longitudinal state $\epsilon^\mu(k, \lambda = 0)$ is proportional to $k^\mu$, and
this is the origin of the numerator factors $k^\mu k^\nu/M^2$ in (19.22).

We shall not give further details here (see also section 22.1.2), but merely 
state that theories with massive charged vector bosons are indeed non-renor- 
malizable. Does this matter? In section 11.8 we explained why it is thought 
that the relevant theories at presently accessible energy scales should be
renormalizable theories. And, apart from anything else, they are much more
predictive. Is there, then, any way of getting rid of the offending ‘$k^\mu k^\nu$’ terms
in the X-propagator, so as (perhaps) to render the theory renormalizable?
Consider the photon propagator of chapter 7 repeated here:
$$\frac{i[-g^{\mu\nu} + (1 - \xi)k^\mu k^\nu / k^2]}{k^2 + i\epsilon}. \qquad (19.23)$$
This contains somewhat similar factors of $k^\mu k^\nu$ (admittedly divided by $k^2$
rather than $M^2$), but they are gauge-dependent, and can in fact be ‘gauged
away’ entirely, by choice of the gauge parameter $\xi$ (namely by taking $\xi = 1$).
But, as we have seen, such ‘gauging’ — essentially the freedom to make gauge 
transformations — seems to be possible only in a massless vector theory. 

A closely related point is that, as section 7.3.1 showed, free photons exist 
in only two polarization states (electromagnetic waves are purely transverse), 
instead of the three we might have expected for a vector (spin-1) particle — 
and as do indeed exist for massive vector particles. This gives another way 
of seeing in what way a massless vector particle is really very different from 
a massive one: the former has only two (spin) degrees of freedom, while the 
latter has three, and it is not at all clear how to ‘lose’ the offending longitudinal 
state smoothly (certainly not, as we have seen, by letting $M \to 0$ in (19.5)).

These considerations therefore suggest the following line of thought: is it 
possible somehow to create a theory involving massive vector bosons, in such 
a way that the dangerous $k^\mu k^\nu$ term can be ‘gauged away’, making the theory
renormalizable? The answer is yes, via the idea of spontaneous breaking of
gauge symmetry. This is the natural generalization of the spontaneous global 
symmetry breaking considered in chapter 17. By way of advance notice, the 
crucial formula is (19.74) for the propagator in such a theory, which is to be 
compared with (19.22). 

The first serious challenge to the then widely held view that electro- 
magnetic gauge invariance requires the photon to be massless was made by 
Schwinger (1962), as we pointed out in section 11.4. Soon afterwards, Ander- 
son (1963) argued that several situations in solid state physics could be inter- 
preted in terms of an effectively massive electromagnetic field. He outlined a 
general framework for treating the phenomenon of the acquisition of mass by 
a gauge boson, and discussed its possible relevance to contemporary attempts 
(Sakurai 1960) to interpret the recently discovered vector mesons ($\rho, \omega, \phi, \ldots$)
as the gauge quanta associated with a local extension of hadronic flavour
symmetry. From his discussion, it is clear that Anderson had his doubts about
the hadronic application, precisely because, as he remarked, gauge bosons can 
only acquire a mass if the symmetry is spontaneously broken. This has the 
consequence, as we saw in chapter 17, that the multiplet structure ordinarily 
associated with a non-Abelian symmetry would be lost. But we know that 
flavour symmetry, even if admittedly not exact, certainly leads to identifiable 
multiplets, which are at least approximately degenerate in mass. It was Wein- 
berg (1967) and Salam (1968) who made the correct application of these ideas, 
to the generation of mass for the gauge quanta associated with the weak force. 
There is, however, nothing specifically relativistic about the basic mechanism 
involved, nor need we start with the non-Abelian case. In fact, the physics 
is well illustrated by the non-relativistic Abelian (i.e. electromagnetic) case 
— which is nothing but the physics of superconductivity. Our presentation is 
influenced by that of Anderson (1963). 




19.2 The generation of ‘photon mass’ in a superconductor: Ginzburg-Landau theory and the Meissner effect


In chapter 17, section 17.7, we gave a brief introduction to some aspects 
of the BCS theory of superconductivity. We were concerned mainly with 
the nature of the BCS ground state, and with the non-perturbative origin 
of the energy gap for elementary excitations. In particular, as noted after 
(17.128), we omitted completely all electromagnetic couplings of the electrons 
in the ‘microscopic’ Hamiltonian. It is certainly possible to complete the BCS 
theory in this way, so as to include within the same formalism a treatment of 
electromagnetic effects (e.g. the Meissner effect) in a superconductor. We refer 
interested readers to the book by Schrieffer (1964), chapter 8. Instead, we shall 
follow a less ‘microscopic’ and somewhat more ‘phenomenological’ approach, 
which has a long history in theoretical studies of superconductivity, and is in 
some ways actually closer (at least formally) to our eventual application in 
particle physics. 


In section 17.3.1 we introduced the concept of an ‘order parameter’, a 
quantity which was a measure of the ‘degree of ordering’ of a system below 
some transition temperature. In the case of superconductivity, the order
parameter (in this sense) is taken to be a complex scalar field $\psi$, as originally
proposed by Ginzburg and Landau (1950), well before the appearance of BCS
theory. Subsequently, Gorkov (1959) and others showed how the Ginzburg-Landau
description could be derived from BCS theory, in certain domains of
temperature and magnetic field. This work all relates to static phenomena.
More recently, an analogous ‘effective theory’ for time-dependent phenomena
(at zero temperature) has been derived from a BCS-type model (Aitchison et
al. 1995). For the moment, we shall follow a more qualitative approach.

The Ginzburg-Landau field $\psi$ is commonly referred to as the ‘macroscopic
wave function’. This terminology originates from the recognition that in the
BCS ground state a macroscopic number of Cooper pairs have ‘condensed’
into the state of lowest energy, a situation similar to that in the Bogoliubov
superfluid. Further, this state is highly coherent, all pairs having the same
total momentum (namely zero, in the case of (17.140)). These considerations
suggest that a successful phenomenology can be built by invoking the idea of
a macroscopic wavefunction $\psi$, describing the condensate. Note that $\psi$ is a
‘bosonic’ quantity, referring essentially to paired electrons. Perhaps the single
most important property of $\psi$ is that it is assumed to be normalized to the
total density of Cooper pairs $n_c$ via the relation
$$|\psi|^2 = n_c = n_s/2 \qquad (19.24)$$

where $n_s$ is the density of superconducting electrons. The quantities $n_c$ and
$n_s$ will depend on temperature T, tending to zero as T approaches the
superconducting transition temperature $T_c$ from below. The precise connection
between $\psi$ and the microscopic theory is indirect; in particular, $\psi$ has no
knowledge of the coordinates of individual electron pairs. Nevertheless, as an
‘empirical’ order parameter, it may be thought of as in some way related to
the ground state ‘pair’ expectation value introduced in (17.121): in particular,
the charge associated with $\psi$ is taken to be $-2e$, and the mass is $2m_e$.

The Ginzburg-Landau description proceeds by considering the quantum-mechanical
electromagnetic current associated with $\psi$, in the presence of a
static external electromagnetic field described by a vector potential $\mathbf{A}$. This
current was considered in section 2.4, and is given by the gauge-invariant form
of (A.7), namely
$$\mathbf{j}_{\rm em} = \frac{-2e}{4m_e i}\left[\psi^*(\nabla + 2ie\mathbf{A})\psi - \{(\nabla + 2ie\mathbf{A})\psi\}^*\psi\right]. \qquad (19.25)$$
Note that we have supplied an overall factor of $-2e$ to turn the Schrödinger
‘number density’ current into the appropriate electromagnetic current.
Assuming now that, consistently with (19.24), $\psi$ is varying primarily through
its phase degree of freedom $\phi$, rather than its modulus $|\psi|$, we can rewrite
(19.25) as
$$\mathbf{j}_{\rm em} = -\frac{2e^2}{m_e}\left(\mathbf{A} + \frac{1}{2e}\nabla\phi\right)|\psi|^2 \qquad (19.26)$$
where $\psi = e^{i\phi}|\psi|$. We easily verify that (19.26) is invariant under the gauge
transformation (2.41), which can be written in this case as
$$\mathbf{A} \to \mathbf{A} + \nabla\chi \qquad (19.27)$$
$$\phi \to \phi - 2e\chi. \qquad (19.28)$$
We now replace $|\psi|^2$ in (19.26) by $n_s/2$ in accordance with (19.24), and take
the curl of the resulting equation to obtain
$$\nabla \times \mathbf{j}_{\rm em} = -\left(\frac{e^2 n_s}{m_e}\right)\mathbf{B}. \qquad (19.29)$$
Equation (19.29) is known as the London equation (London 1950), and is one 
of the fundamental phenomenological relations in superconductivity. 
The significance of (19.29) emerges when we combine it with the (static) 
Maxwell equation 
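$$\nabla \times \mathbf{B} = \mathbf{j}_{\rm em}. \qquad (19.30)$$

Taking the curl of (19.30), and using $\nabla \times (\nabla \times \mathbf{B}) = \nabla(\nabla \cdot \mathbf{B}) - \nabla^2\mathbf{B}$ and
$\nabla \cdot \mathbf{B} = 0$, we find
$$\nabla^2\mathbf{B} = \left(\frac{e^2 n_s}{m_e}\right)\mathbf{B}. \qquad (19.31)$$
The variation of magnetic field described by (19.31) is a very characteristic
one encountered in a number of contexts in condensed matter physics. First,
we note that the quantity $(e^2 n_s/m_e)$ must (in our units) have the dimensions
of (length)$^{-2}$, by comparison with the left-hand side of (19.31). Let us write
$$\left(\frac{e^2 n_s}{m_e}\right) = \frac{1}{\lambda^2}. \qquad (19.32)$$

Next, consider for simplicity one-dimensional variation
$$\frac{\mathrm{d}^2 B}{\mathrm{d}x^2} = \frac{1}{\lambda^2}B \qquad (19.33)$$
in the half-plane x > 0, say. Then the solutions of (19.33) have the form
$$B(x) = B_0 \exp(-x/\lambda); \qquad (19.34)$$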


the exponentially growing solution is rejected as unphysical. The field
therefore penetrates only a distance of order $\lambda$ into the region x > 0. The range
parameter $\lambda$ is called the screening length. This expresses the fact that, in a
medium such that (19.29) holds, the magnetic field will be ‘screened out’ from
penetrating further into the medium.
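The screened profile (19.34) can be tested against the one-dimensional equation (19.33) with a finite-difference second derivative. This is only an illustrative sketch: the step size, sample points and function names are assumptions chosen for the check.

```python
import math


def screened_field(x, b0, lam):
    """Decaying solution (19.34): B(x) = B0 exp(-x / lambda)."""
    return b0 * math.exp(-x / lam)


def second_derivative(f, x, step=1e-3):
    """Central finite-difference estimate of f''(x)."""
    return (f(x + step) - 2.0 * f(x) + f(x - step)) / (step * step)


def london_residual(x, b0=1.0, lam=1.0):
    """d^2B/dx^2 - B/lambda^2, which should vanish for a solution of (19.33)."""
    f = lambda y: screened_field(y, b0, lam)
    return second_derivative(f, x) - f(x) / lam ** 2


for x in (0.5, 1.0, 2.0, 3.0):
    print(x, screened_field(x, 1.0, 1.0), london_residual(x))
```

The residual is limited only by the discretization error of the difference formula, while the field itself falls by a factor of e for each additional distance $\lambda$.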

The physical origin of the screening is provided by Lenz’s law: when a 
magnetic field is applied to a system of charged particles, induced EMFs are 
set up which accelerate the particles, and the magnetic effect of the resulting 
currents tends to cancel (or screen) the applied field. On the atomic scale this 
is the cause of atomic diamagnetism. Here the effect is occurring on a
macroscopic scale (as mediated by the ‘macroscopic wavefunction’ $\psi$), and leads to
the Meissner effect: the exclusion of flux from the interior of a superconductor.
In this case, screening currents are set up within the superconductor,
over distances of order $\lambda$ from the exterior boundary of the material. These
exactly cancel (perfectly screen) the applied flux density in the interior.
With $n_s \sim 4 \times 10^{28}\ {\rm m}^{-3}$ (roughly one conduction electron per atom) we find
$$\lambda = \left(\frac{m_e}{n_s e^2}\right)^{1/2} \approx 10^{-8}\ {\rm m}, \qquad (19.35)$$
which is the correct order of magnitude for the thickness of the surface layer
within which screening currents flow, and over which the applied field falls to
zero. As $T \to T_c$, $n_s \to 0$ and $\lambda$ becomes arbitrarily large, so that flux is no
longer screened.
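The order of magnitude in (19.35) can be reproduced with a short calculation. The sketch below assumes the SI form of the screening length, $\lambda = (m_e/\mu_0 n_s e^2)^{1/2}$, in which the vacuum permeability $\mu_0$ restores the factors suppressed by the natural units of the text; the constants are standard SI values and the function name is illustrative.

```python
import math

MU_0 = 1.25663706212e-6      # vacuum permeability in N A^-2 (SI conversion, assumed here)
M_ELECTRON = 9.1093837015e-31  # electron mass in kg
E_CHARGE = 1.602176634e-19     # elementary charge in C


def screening_length(n_s):
    """Screening length in metres for a superconducting electron density n_s in m^-3."""
    return math.sqrt(M_ELECTRON / (MU_0 * n_s * E_CHARGE ** 2))


print(screening_length(4e28))   # of order 1e-8 m, as quoted in (19.35)
print(screening_length(4e26))   # smaller n_s gives a longer screening length
```

Because $\lambda \propto n_s^{-1/2}$, the second call illustrates how the screening length grows as the density of superconducting electrons falls towards zero near $T_c$.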

It is quite simple to interpret equation (19.31) in terms of an ‘effective 
non-zero photon mass’. Consider the equation (19.8) for a free massive vector 
field. Taking the divergence via $\partial_\nu$ leads to
$$M^2 \partial_\nu X^\nu = 0 \qquad (19.36)$$

(cf (19.11)), and so (19.8) can be written as
$$(\Box + M^2)X^\nu = 0, \qquad (19.37)$$

which simply expresses the fact that each component of $X^\nu$ has mass M. Now
consider the static version of (19.37), in the rest frame of the X-particle in
which (see equation (19.13)) the $\nu = 0$ component vanishes. Equation (19.37)
reduces to
$$\nabla^2\mathbf{X} = M^2\mathbf{X} \qquad (19.38)$$

which is exactly the same in form as (19.31) (if $\mathbf{X}$ were the electromagnetic
field $\mathbf{A}$, we could take the curl of (19.38) to obtain (19.31) via $\mathbf{B} = \nabla \times \mathbf{A}$).
The connection is made precise by making the association
$$M^2 = \left(\frac{e^2 n_s}{m_e}\right) = \frac{1}{\lambda^2}. \qquad (19.39)$$

Equation (19.39) shows very directly another way of understanding the ‘screening
length ↔ photon mass’ connection: in our units $\hbar = c = 1$, a mass has
the dimension of an inverse length, and so we naturally expect to be able to
interpret $\lambda^{-1}$ as an equivalent mass (for the photon, in this case).

The above treatment conveys much of the essential physics behind the 
phenomenon of ‘photon mass generation’ in a superconductor. In particular, 
it suggests rather strongly that a second field, in addition to the electromagnetic
one, is an essential element in the story (here, it is the $\psi$ field). This
provides a partial answer to the puzzle about the discontinuous change in
the number of spin degrees of freedom in going from a massless to a massive
gauge field: actually, some other field has to be supplied. Nevertheless, many
questions remain unanswered so far. For example, how is all the foregoing
related to what we learned in chapter 17 about spontaneous symmetry breaking?
Where is the Goldstone mode? Is it really all gauge invariant? And
what about Lorentz invariance? Can we provide a Lagrangian description of
the phenomenon? The answers to these questions are mostly contained in the 
model to which we now turn, which is due to Higgs (1964) and is essentially 
the local version of the U(1) Goldstone model of section 17.5. 




19.3 Spontaneously broken local U(1) symmetry: 
the Abelian Higgs model 


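This model is just $\hat{\mathcal{L}}_{\rm G}$ of (17.69) and (17.77), extended so as to be locally,
rather than merely globally, U(1) invariant. Due originally to Higgs (1964), it
provides a deservedly famous and beautifully simple model for investigating
what happens when a gauge symmetry is spontaneously broken.

To make (17.69) locally U(1) invariant, we need only replace the $\partial$'s by
$D$'s according to the rule (7.123), and add the Maxwell piece. This produces
$$\hat{\mathcal{L}}_{\rm H} = [(\partial^\mu + iq\hat{A}^\mu)\hat\phi]^\dagger[(\partial_\mu + iq\hat{A}_\mu)\hat\phi] - \frac{1}{4}\hat{F}_{\mu\nu}\hat{F}^{\mu\nu} - \frac{1}{4}\lambda(\hat\phi^\dagger\hat\phi)^2 + \mu^2(\hat\phi^\dagger\hat\phi). \qquad (19.40)$$

(19.40) is invariant under the local version of (17.72), namely
$$\hat\phi(x) \to \hat\phi'(x) = e^{-i\hat\alpha(x)}\hat\phi(x) \qquad (19.41)$$

when accompanied by the gauge transformation on the potentials
$$\hat{A}^\mu(x) \to \hat{A}'^\mu(x) = \hat{A}^\mu(x) + \frac{1}{q}\partial^\mu\hat\alpha(x). \qquad (19.42)$$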


Before proceeding any further, we note at once that this model contains four
field degrees of freedom: two in the complex scalar Higgs field $\hat\phi$, and two in
the massless gauge field $\hat{A}^\mu$.

We learned in section 17.5 that the form of the potential terms in (19.40)
(specifically the $\mu^2$ one) does not lend itself to a natural particle interpretation,
which only appears after making a ‘shift to the classical minimum’, as
in (17.84). But there is a remarkable difference between the global and local
cases. In the present (local) case, the phase of $\hat\phi$ is completely arbitrary, since
any change in $\hat\alpha$ of (19.41) can be compensated by an appropriate transformation
(19.42) on $\hat{A}^\mu$, leaving $\hat{\mathcal{L}}_{\rm H}$ the same as before. Thus the field $\hat\theta$ in
(17.84) can be ‘gauged away’ altogether, if we choose! But $\hat\theta$ was precisely the
Goldstone field, in the global case. This must mean that there is somehow
no longer any physical manifestation of the massless mode. This is the first
unexpected result in the local case. We may also be reminded of our desire to
‘gauge away’ the longitudinal polarization states for a ‘massive gauge’ boson:
we shall return to this later.

However, a degree of freedom (the Goldstone mode) cannot simply disappear.
Somehow the system must keep track of the fact that we started
with four degrees of freedom. To see what is going on, let us study the field
equation for $\hat{A}^\nu$, namely
$$\Box\hat{A}^\nu - \partial^\nu(\partial_\mu\hat{A}^\mu) = \hat{j}^\nu_{\rm em} \qquad (19.43)$$

where $\hat{j}^\nu_{\rm em}$ is the electromagnetic current contained in (19.40). This current
can be obtained just as in (7.141), and is given by
$$\hat{j}^\nu_{\rm em} = iq(\hat\phi^\dagger\partial^\nu\hat\phi - (\partial^\nu\hat\phi)^\dagger\hat\phi) - 2q^2\hat{A}^\nu\hat\phi^\dagger\hat\phi. \qquad (19.44)$$
We now insert the field parametrization (cf (17.84))
$$\hat\phi(x) = \frac{1}{\sqrt{2}}(v + \hat{h}(x))\exp(-i\hat\theta(x)/v) \qquad (19.45)$$
into (19.44), where $v/\sqrt{2} = 2^{1/2}|\mu|/\lambda^{1/2}$ is the position of the minimum of the
classical potential as a function of $|\hat\phi|$, as in (17.81). We obtain (problem 19.4)
$$\hat{j}^\nu_{\rm em} = -v^2q^2\left(\hat{A}^\nu - \frac{\partial^\nu\hat\theta}{vq}\right) + \text{terms quadratic and cubic in the fields}. \qquad (19.46)$$

The linear part of the right-hand side of (19.46) is directly analogous to the
non-relativistic current (19.26), interpreting $\hat\theta$ as essentially playing the role of
$\phi$, and $|\psi|^2$ the role of $v^2$. Retaining just the linear terms in (19.46) (the others
would appear on the right-hand side of equation (19.47) following, where they
would represent interactions), and placing this $\hat{j}^\nu_{\rm em}$ in (19.43), we obtain
$$\Box\hat{A}^\nu - \partial^\nu\partial_\mu\hat{A}^\mu = -v^2q^2\left(\hat{A}^\nu - \frac{\partial^\nu\hat\theta}{vq}\right). \qquad (19.47)$$

Now a gauge transformation on $\hat{A}^\nu$ has the form shown in (19.42), for arbitrary
$\hat\alpha$. So we can certainly regard the whole expression $(\hat{A}^\nu - \partial^\nu\hat\theta/vq)$ as a perfectly
acceptable gauge field. Let us define
$$\hat{A}'^\nu = \hat{A}^\nu - \frac{\partial^\nu\hat\theta}{vq}. \qquad (19.48)$$


Then, since we know that the left-hand side of (19.47) is invariant under
(19.48), the resulting equation for $\hat{A}'^\nu$ is
$$\Box\hat{A}'^\nu - \partial^\nu\partial_\mu\hat{A}'^\mu = -v^2q^2\hat{A}'^\nu, \qquad (19.49)$$

or
$$(\Box + v^2q^2)\hat{A}'^\nu - \partial^\nu\partial_\mu\hat{A}'^\mu = 0. \qquad (19.50)$$

But (19.50) is nothing but the equation (19.8) for a free massive vector field,
with mass $M = vq$! This fundamental observation was first made, in the
relativistic context, by Englert and Brout (1964), Higgs (1964), and Guralnik
et al. (1964); for a full account, see Higgs (1966).
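The step from (19.44) to the linear current in (19.46) can be checked numerically for a single spacetime component. The sketch below assumes the parametrization (19.45) with the phase $\exp(-i\hat\theta/v)$ and treats the fields and their derivatives as ordinary numbers at one point; all function names and sample values are illustrative.

```python
import cmath
import math


def em_current(phi, dphi, a, q):
    """One component of (19.44): iq(phi* dphi - dphi* phi) - 2 q^2 A |phi|^2."""
    j = 1j * q * (phi.conjugate() * dphi - dphi.conjugate() * phi)
    j -= 2.0 * q * q * a * abs(phi) ** 2
    return j.real  # the imaginary part vanishes identically


def polar_fields(v, h, theta, dh, dtheta):
    """phi and its derivative for phi = (v + h) exp(-i theta / v) / sqrt(2)."""
    phase = cmath.exp(-1j * theta / v)
    phi = (v + h) * phase / math.sqrt(2.0)
    dphi = (dh - 1j * (v + h) * dtheta / v) * phase / math.sqrt(2.0)
    return phi, dphi


def linear_current(v, q, a, dtheta):
    """Linear part of (19.46): -v^2 q^2 (A - dtheta / (v q))."""
    return -(v * q) ** 2 * (a - dtheta / (v * q))


v, q, a, dtheta = 1.3, 0.7, 0.9, 0.25
phi, dphi = polar_fields(v, h=0.0, theta=0.4, dh=0.1, dtheta=dtheta)
print(em_current(phi, dphi, a, q), linear_current(v, q, a, dtheta))
```

With $\hat{h}$ set to zero the two numbers agree. Keeping $\hat{h} \neq 0$ multiplies the result by $(v + h)^2/v^2$, which is the origin of the quadratic and cubic terms mentioned after (19.46).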

The foregoing analysis shows us two things. First, the current (19.46) is
indeed a relativistic analogue of (19.26), in that it provides a ‘screening’ (mass
generation) effect on the gauge field. Second, equation (19.48) shows how the
phase degree of freedom of the Higgs field $\hat\phi$ has been incorporated into a new
gauge field $\hat{A}'^\nu$, which is massive, and therefore has ‘three’ spin degrees of
freedom. In fact, we can go further. If we imagine plane wave solutions for
$\hat{A}'^\nu$, $\hat{A}^\nu$ and $\hat\theta$, we see that the $\partial^\nu\hat\theta/vq$ part of (19.48) will contribute something
proportional to $k^\nu/M$ to the polarization vector of $\hat{A}'^\nu$ (recall $M = vq$). But
this is exactly the (large k) behaviour of the longitudinal polarization vector of
a massive vector particle. We may therefore say that the massless gauge field
$\hat{A}^\nu$ has ‘swallowed’ the Goldstone field $\hat\theta$ via (19.48) to make the massive vector
field $\hat{A}'^\nu$. The Goldstone field disappears as a massless degree of freedom, and
reappears, via its gradient, as the longitudinal part of the massive vector field.
In this way the four degrees of freedom are all now safely accounted for: three
are in the massive vector field $\hat{A}'^\nu$, and one is in the real scalar field $\hat{h}$ (to
which we shall return shortly).

In this (relativistic) case, we know from Lorentz covariance that all the 
components (transverse and longitudinal) of the vector field must have the 
same mass, and this has of course emerged automatically from our covariant 
treatment. But the transverse and longitudinal degrees of freedom respond 
differently in the non-relativistic (superconductor) case. There, the longitu- 
dinal part of A couples strongly to longitudinal excitations of the electrons: 
primarily, as Bardeen (1957) first recognized, to the collective density fluctu- 
ation mode of the electron system — that is, to plasma oscillations. This is 
a high frequency mode, and is essentially the one discussed in section 17.3.2, 
after equation (17.46). When this aspect of the dynamics of the electrons is 
included, a fully gauge invariant description of the electromagnetic properties
of superconductors, within the BCS theory, is obtained (Schrieffer 1964,
chapter 8).

We return to equations (19.48)-(19.50). Taking the divergence of (19.50)
leads, as we have seen, to the condition
$$\partial_\mu\hat{A}'^\mu = 0 \qquad (19.51)$$

on $\hat{A}'^\mu$. It follows that in order to interpret the relation (19.48) as a gauge
transformation on $\hat{A}^\nu$ we must, to be consistent with (19.51), regard $\hat{A}^\mu$ as
being in a gauge specified by
$$\partial_\mu\hat{A}^\mu = \frac{1}{vq}\Box\hat\theta = \frac{1}{M}\Box\hat\theta. \qquad (19.52)$$

In going from the situation described by $\hat{A}^\mu$ and $\hat\theta$ to one described by $\hat{A}'^\mu$
alone via (19.48), we have evidently chosen a gauge function (cf (19.42))
$$\hat\alpha(x) = -\hat\theta(x)/v. \qquad (19.53)$$
Recalling then the form of the associated local phase change on $\hat\phi(x)$
$$\hat\phi(x) \to \hat\phi'(x) = e^{i\hat\theta(x)/v}\hat\phi(x) \qquad (19.54)$$

we see that the phase of $\hat\phi$ in (19.45) has been reduced to zero, in this choice
of gauge. Thus it is indeed possible to ‘gauge $\hat\theta$ away’ in (19.45), but then
the vector field we must use is $\hat{A}'^\nu$, satisfying the massive equation (19.50)
(ignoring other interactions). In superconductivity, the choice of gauge which
takes the macroscopic wavefunction to be real (i.e. $\phi = 0$ in (19.26)) is called
the ‘London gauge’. In the next section we shall discuss a subtlety in the
argument which applies in the case of real superconductors, and which leads
to the phenomenon of flux quantization.

The fact that this ‘Higgs mechanism’ leads to a massive vector field can
be seen very economically by working in the particular gauge for which $\hat\phi$ is
real, and inserting the parametrization (cf (19.45))
$$\hat\phi = \frac{1}{\sqrt{2}}(v + \hat{h}) \qquad (19.55)$$
into the Lagrangian $\hat{\mathcal{L}}_{\rm H}$. Retaining only the terms quadratic in the fields one
finds (problem 19.5)
$$\begin{aligned}\hat{\mathcal{L}}^{\rm quad}_{\rm H} = {} & -\frac{1}{4}(\partial_\mu\hat{A}_\nu - \partial_\nu\hat{A}_\mu)(\partial^\mu\hat{A}^\nu - \partial^\nu\hat{A}^\mu) + \frac{1}{2}q^2v^2\hat{A}_\mu\hat{A}^\mu \\ & + \frac{1}{2}\partial_\mu\hat{h}\,\partial^\mu\hat{h} - \mu^2\hat{h}^2.\end{aligned} \qquad (19.56)$$

The first line of (19.56) is exactly the Lagrangian for a spin-1 field of mass
$vq$, i.e. the Maxwell part with the addition of a mass term (note that the
sign of the mass term is correct for the spatial (physical) degrees of freedom);
the second line is the Lagrangian of a scalar particle of mass $\sqrt{2}\mu$. The latter
is the mass of excitations of the Higgs field $\hat{h}$ away from its vacuum value
(compare the global U(1) case discussed in section 17.5). The necessity for
the existence of one or more massive scalar particles (‘Higgs bosons’), when
a gauge symmetry is spontaneously broken in this way, was pointed out by
Higgs (1964).
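The location of the minimum and the scalar mass $\sqrt{2}\mu$ can both be checked numerically. The sketch below assumes the classical potential $V = -\mu^2|\phi|^2 + \frac{1}{4}\lambda|\phi|^4$ read off from (19.40), treats the field as an ordinary number, and uses a finite-difference curvature; the names and sample parameters are illustrative.

```python
import math


def potential(rho, mu, lam):
    """Classical potential V = -mu^2 rho^2 + (lam/4) rho^4, with rho = |phi|."""
    return -mu ** 2 * rho ** 2 + 0.25 * lam * rho ** 4


def vacuum_value(mu, lam):
    """v such that |phi| = v / sqrt(2) minimizes V, i.e. v = 2|mu| / sqrt(lam)."""
    return 2.0 * abs(mu) / math.sqrt(lam)


def scalar_mass(mu, lam, step=1e-3):
    """Mass of h from the curvature of V((v + h)/sqrt(2)) at h = 0, using m^2 = V''."""
    v = vacuum_value(mu, lam)
    shifted = lambda h: potential((v + h) / math.sqrt(2.0), mu, lam)
    curvature = (shifted(step) - 2.0 * shifted(0.0) + shifted(-step)) / step ** 2
    return math.sqrt(curvature)


mu, lam = 1.0, 0.5
print(vacuum_value(mu, lam))                      # v
print(scalar_mass(mu, lam), math.sqrt(2.0) * mu)  # numerical curvature vs sqrt(2) mu
```

The curvature is independent of $\lambda$, consistent with the $-\mu^2\hat{h}^2$ term in (19.56): changing `lam` moves the minimum but leaves the scalar mass at $\sqrt{2}\mu$.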

We may now ask: what happens if we start with a certain phase $\hat\theta$ for $\hat\phi$
but do not make use of the gauge freedom in $\hat{A}^\nu$ to reduce $\hat\theta$ to zero? We shall
see in section 19.5 that the equation of motion, and hence the propagator,
for the vector particle depends on the choice of gauge; furthermore, Feynman
graphs involving quanta corresponding to the degree of freedom associated
with the phase field $\hat\theta$ will have to be included for a consistent theory, even
though this must be an unphysical degree of freedom, as follows from the fact
that a gauge can be chosen in which this field vanishes. That the propagator
is gauge dependent should, on reflection, come as a relief. After all, if the
massive vector boson generated in this way were simply described by the
wave equation (19.50), all the troubles with massive vector particles outlined
in section 19.1 would be completely unresolved. As we shall see, a different
choice of gauge from that which renders $\hat\phi$ real has precisely the effect of
ameliorating the bad high-energy behaviour associated with (19.50). This is
ultimately the reason for the following wonderful fact: massive vector theories,
in which the vector particles acquire mass through the spontaneous symmetry
breaking mechanism, are renormalizable ('t Hooft 1971b).

However, before discussing other gauges than the one in which $\hat\phi$ is given
by (19.55), we first explore another interesting aspect of superconductivity.




19.4 Flux quantization in a superconductor 


Though a slight diversion, it is convenient to include a discussion of flux quan- 
tization at this point, while we have a number of relevant results assembled. 
Apart from its intrinsic interest, the phenomenon may also provide a useful 
physical model for the ‘confining’ property of QCD, as already discussed in 
sections 1.3.6 and 16.5.3. 

Our discussion of superconductivity so far has dealt, in fact, with only 
one class of superconductors, called type I; these remain superconducting 
throughout the bulk of the material (exhibiting a complete Meissner effect), 
when an external magnetic field of less than a certain critical value is applied. 
There is a quite separate class — type II superconductors — which allow partial 
entry of the external field, in the form of thin filaments of flux. Within each 
filament the field is high, and the material is not superconducting. Outside the 
core of the filaments, the material is superconducting and the field dies off over 
the characteristic penetration length $\lambda$. Around each filament of magnetic flux
there circulates a vortex of screening current; the filaments are often called 
vortex lines. It is as if numerous thin cylinders, each enclosing flux, had been 
drilled in a block of type I material, thereby producing a non-simply connected 
geometry. 

In real superconductors, screening currents are associated with the macroscopic
pair wavefunction (field) $\psi$. For type II behaviour to be possible, $|\psi|$
must vanish at the centre of a flux filament, and rise to the constant value
appropriate to the superconducting state over a distance $\xi < \lambda$, where $\xi$ is the
‘coherence length’ of section 17.7. According to the Ginzburg-Landau (GL)
theory, a more precise criterion is that type II behaviour holds if $\xi < 2^{1/2}\lambda$;
both $\xi$ and $\lambda$ are, of course, temperature-dependent. The behaviour of $|\psi|$ and
B in the vicinity of a flux filament is shown in figure 19.2.

FIGURE 19.2
Magnetic field B and modulus of the macroscopic (pair) wavefunction $|\psi|$ in
the neighbourhood of a flux filament.

Thus, whereas for simple type I superconductivity, $|\psi|$ is simply set equal to a
constant, in the type II case $|\psi|$ has the variation shown in this figure. Solutions
of the coupled GL equations for $\mathbf{A}$ and $\psi$ can be obtained which exhibit this behaviour.
An important result is that the flux through a vortex line is quantized. To 
see this, we write 
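$$\psi = e^{i\phi}|\psi| \qquad (19.57)$$

as before. The expression for the electromagnetic current is
$$\mathbf{j}_{\rm em} = -\frac{q^2}{m}\left(\mathbf{A} - \frac{\nabla\phi}{q}\right)|\psi|^2 \qquad (19.58)$$

as in (19.26), but in (19.58) we are leaving the charge parameter q undetermined
for the moment; the mass parameter m will be unimportant. Rearranging,
we have
$$\mathbf{A} = -\frac{m}{q^2|\psi|^2}\,\mathbf{j}_{\rm em} + \frac{\nabla\phi}{q}. \qquad (19.59)$$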
Let us integrate equation (19.59) around any closed loop C in the type II
superconductor, which encloses a flux (or vortex) line. Far enough away from
the vortex, the screening currents $\mathbf{j}_{\rm em}$ will have dropped to zero, and hence
$$\oint_C \mathbf{A} \cdot \mathrm{d}\mathbf{s} = \frac{1}{q}\oint_C \nabla\phi \cdot \mathrm{d}\mathbf{s} = \frac{1}{q}[\phi]_C \qquad (19.60)$$

where $[\phi]_C$ is the change in phase around C. If the wavefunction $\psi$ is
single-valued, the change in phase $[\phi]_C$ for any closed path can only be zero or an
integer multiple of $2\pi$. Transforming the left-hand side of (19.60) by Stokes’
Theorem, we obtain the result that the flux $\Phi$ through any surface spanning
C is quantized:
$$\Phi = \int \mathbf{B} \cdot \mathrm{d}\mathbf{S} = \frac{2\pi n}{q} = n\Phi_0, \qquad (19.61)$$

where $\Phi_0 = 2\pi/q$ is the flux quantum (or $2\pi\hbar/q$ in ordinary units). It is not
entirely self-evident why $\psi$ should be single-valued, but experiments do indeed
demonstrate the phenomenon of flux quantization, in units of $\Phi_0$ with $|q| = 2e$
(which may be interpreted as the charge on a Cooper pair, as usual). The
phenomenon is seen in non-simply connected specimens of type I superconductors
(i.e. ones with holes in them, such as a ring), and in the flux filaments of type
II materials; in the latter case each filament carries a single flux quantum $\Phi_0$.
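For orientation, the size of the flux quantum in ordinary units follows directly from $\Phi_0 = 2\pi\hbar/|q|$ with $|q| = 2e$. The short sketch below uses standard SI values of the constants; the function name is illustrative.

```python
import math

HBAR = 1.054571817e-34      # reduced Planck constant in J s
E_CHARGE = 1.602176634e-19  # elementary charge in C


def flux_quantum(charge):
    """Flux quantum 2*pi*hbar/|q| in webers for a condensate of charge q."""
    return 2.0 * math.pi * HBAR / abs(charge)


phi_0 = flux_quantum(2.0 * E_CHARGE)   # Cooper pairs: |q| = 2e
print(phi_0)                            # about 2.07e-15 Wb
print(flux_quantum(E_CHARGE) / phi_0)   # a charge-e condensate would give twice this
```

The factor of two in the last line is what allows the measured quantum to identify the charge of the condensed objects as $2e$ rather than $e$.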

It is interesting to consider now a situation — so far entirely hypothetical 
— in which a magnetic monopole is placed in a superconductor. Dirac showed 
(1931) that for consistency with quantum mechanics the monopole strength 
$g_m$ had to satisfy the ‘Dirac quantization condition’
$$q g_m = n/2 \qquad (19.62)$$

where q is any electronic charge, and n is an integer. It follows from (19.62)
that the flux $4\pi g_m$ out of any closed surface surrounding the monopole is
quantized in units of $\Phi_0$. Hence a flux filament in a superconductor can
originate from, or be terminated by, a Dirac monopole (with the appropriate
sign of $g_m$), as was first pointed out by Nambu (1974).

This is the basic model which, in one way or another, underlies many 
theoretical attempts to understand confinement. The monopole—antimonopole 
pair in a type II superconducting vacuum, joined by a quantized magnetic flux 
filament, provides a model of a meson. As the distance between the pair — 
the length of the filament — increases, so does the energy of the filament, at a 
rate proportional to its length, since the flux cannot spread out in directions 
transverse to the filament. This is exactly the kind of linearly rising potential 
energy required by hadron spectroscopy (see equations (1.33) and (16.145)). 
The configuration is stable, because there is no way for the flux to leak away; 
it is a conserved quantized quantity. 

For the eventual application to QCD, one will want (presumably) par- 
ticles carrying non-zero values of the colour quantum numbers to be con- 
fined. These quantum numbers are the analogues of electric charge in the 
U(1) case, rather than of magnetic charge. We imagine, therefore, interchang- 
ing the roles of magnetism and electricity in all of the foregoing. Indeed, the 
Maxwell equations have such a symmetry when monopoles are present, as well 
as charges. The essential feature of the superconducting ground state was that 
it involved the coherent state formed by condensation of electrically charged 
bosonic fermion pairs. A vacuum which confined filaments of E rather than B 
may be formed as a coherent state of condensed magnetic monopoles (Man- 
delstam 1976, 't Hooft 1976). These E filaments would then terminate on 
electric charges. Now magnetic monopoles do not occur naturally as solutions 
of QED: they would have to be introduced by hand. Remarkably enough,
however, solutions of the magnetic monopole type do occur in the case of
non-Abelian gauge field theories, whose symmetry is spontaneously broken
to an electromagnetic U(1)$_{\rm em}$ gauge group. Just this circumstance can arise
in a grand unified theory which contains SU(3)$_{\rm c}$ and a residual U(1)$_{\rm em}$.
Incidentally, these monopole solutions provide an illuminating way of thinking
about charge quantization: as Dirac (1931) pointed out, the existence of just 
one monopole implies, from his quantization condition (19.62), that charge is 
quantized. 

When these ideas are applied to QCD, E and B must be understood as 
the appropriate colour fields (i.e. they carry an SU(3)$_{\rm c}$ index). The group
structure of SU(3) is also quite different from that of U(1) models, and we do 
not want to be restricted just to static solutions (as in the GL theory, here 
used as an analogue). Whether in fact the real QCD vacuum (ground state) 
is formed as some such coherent plasma of monopoles, with confinement of 
electric charges and flux, is a subject of continuing research; other schemes are 
also possible. As so often stressed, the difficulty lies in the non-perturbative 
nature of the confinement problem. 


We must now at last grasp the nettle and consider what happens if, in the
parametrization

$$\hat{\phi} = |\hat{\phi}| \exp(i\hat{\theta}(x)/v) \qquad (19.63)$$
we do not choose the gauge (cf (19.52)) 


ð A" = 06/M. (19.64) 


This was the gauge that enabled us to transform away the phase degree of
freedom and reduce the equation of motion for the electromagnetic field to
that of a massive vector boson. Instead of using the modulus and phase as
the two independent degrees of freedom for the complex Higgs field $\hat{\phi}$,
we now choose to parametrize $\hat{\phi}$, quite generally, by the decomposition

$$\hat{\phi} = 2^{-1/2}\,[v + \hat{\chi}_1(x) + i\hat{\chi}_2(x)], \qquad (19.65)$$

where the vacuum values of $\hat{\chi}_1$ and $\hat{\chi}_2$ are zero. Substituting
this form for $\hat{\phi}$ into the master equation for $\hat{A}^\nu$ (obtained from
(19.43) and (19.44))

$$\Box\hat{A}^\nu - \partial^\nu(\partial_\mu\hat{A}^\mu) = iq[\hat{\phi}^\dagger\partial^\nu\hat{\phi} - (\partial^\nu\hat{\phi}^\dagger)\hat{\phi}] - 2q^2\hat{A}^\nu\hat{\phi}^\dagger\hat{\phi}, \qquad (19.66)$$

leads to the equation of motion

$$(\Box + M^2)\hat{A}^\nu - \partial^\nu(\partial_\mu\hat{A}^\mu) = -M\partial^\nu\hat{\chi}_2 + q(\hat{\chi}_2\partial^\nu\hat{\chi}_1 - \hat{\chi}_1\partial^\nu\hat{\chi}_2) - q^2\hat{A}^\nu(\hat{\chi}_1^2 + 2v\hat{\chi}_1 + \hat{\chi}_2^2) \qquad (19.67)$$

with $M = qv$. At first sight this just looks like the equation of motion of
an ordinary massive vector field $\hat{A}^\nu$ coupled to a rather complicated current.
However, this certainly cannot be right, as we can see by a count of the degrees
of freedom. In the previous gauge we had four degrees of freedom, counted
either as two for the original massless $\hat{A}^\nu$ plus one each for $\hat{\theta}$
and $\hat{h}$, or as three for the massive $\hat{A}^\nu$ and one for $\hat{h}$. If we
take this new equation at face value, there seem to be three degrees of freedom
for the massive field $\hat{A}^\nu$, and one for each of $\hat{\chi}_1$ and
$\hat{\chi}_2$, making five in all. Actually, we know perfectly well that
we can make use of the freedom of gauge choice to set $\hat{\chi}_2$ to zero, say,
reducing $\hat{\phi}$ to a real quantity and eliminating a spurious degree of
freedom: we have then returned to the form (19.55). In terms of (19.67), the
consequence of the unwanted degree of freedom is quite subtle, but it is basic to
all gauge theories and already appeared in the photon case, in section 7.3.2. The
difficulty arises when we try to calculate the propagator for $\hat{A}^\nu$ from
equation (19.67).

FIGURE 19.3
$\hat{A}^\nu$–$\hat{\chi}_2$ coupling.

The operator on the left-hand side can be simply inverted, as was done in
section 19.1, to yield (apparently) the standard massive vector boson propagator

$$i(-g^{\mu\nu} + k^\mu k^\nu/M^2)/(k^2 - M^2). \qquad (19.68)$$

However, the current on the right-hand side of (19.67) is rather peculiar:
instead of having only terms corresponding to $\hat{A}^\nu$ coupling to two or
three particles, there is also a term involving only one field. This is the term
$-M\partial^\nu\hat{\chi}_2$, which tells us that $\hat{A}^\nu$ actually couples
directly to the scalar field $\hat{\chi}_2$ via the gradient coupling
$(-M\partial^\nu)$. In momentum space this corresponds to a coupling strength
$-ik^\nu M$ and an associated vertex as shown in figure 19.3.
Clearly, for a scalar particle, the momentum 4-vector is the only quantity
that can couple to the vector index of the vector boson. The existence of
this coupling shows that the propagators of $\hat{A}^\nu$ and $\hat{\chi}_2$ are
necessarily mixed: the complete vector propagator must be calculated by summing
the infinite series shown diagrammatically in figure 19.4. This complication is,
of course, completely eliminated by the gauge choice $\hat{\chi}_2 = 0$. However,
we are interested in pursuing the case $\hat{\chi}_2 \neq 0$.

In figure 19.4 the only unknown factor is the propagator for $\hat{\chi}_2$. This
can be easily found by substituting (19.65) into $\hat{\mathcal{L}}_{\rm H}$ and
examining the part which is quadratic in the $\hat{\chi}$'s; we find (problem 19.6)

$$\hat{\mathcal{L}}_{\rm H} = \tfrac{1}{2}\partial_\mu\hat{\chi}_1\partial^\mu\hat{\chi}_1 + \tfrac{1}{2}\partial_\mu\hat{\chi}_2\partial^\mu\hat{\chi}_2 - \mu^2\hat{\chi}_1^2 + \text{cubic and quartic terms}. \qquad (19.69)$$

FIGURE 19.4
Series for the full $\hat{A}^\nu$ propagator.

FIGURE 19.5
Formal summation of the series in figure 19.4.

Equation (19.69) confirms that $\hat{\chi}_1$ is a massive field with mass
$\sqrt{2}\mu$ (like the $\hat{h}$ in (19.56)), while $\hat{\chi}_2$ is massless.
The $\hat{\chi}_2$ propagator is therefore $i/k^2$. Now that all the elements of the
diagrams are known, we can formally sum the series by generalizing the well-known
result (cf (10.12) and (11.27))

$$(1 - x)^{-1} = 1 + x + x^2 + \cdots. \qquad (19.70)$$


Diagrammatically, we rewrite the propagator of figure 19.4 as in figure 19.5 and
perform the sum. Inserting the expressions for the propagators and vector-scalar
coupling, and keeping track of the indices, we finally arrive at the result
(problem 19.7)

$$i\left(\frac{-g^{\mu\lambda} + k^\mu k^\lambda/M^2}{k^2 - M^2}\right)\left(g^\nu_\lambda - k^\nu k_\lambda/k^2\right)^{-1} \qquad (19.71)$$


for the full propagator. But the inverse required in (19.71) is precisely (with a 
lowered index) the one we needed for the photon propagator in (7.91) — and, 
as we saw there, it does not exist. At last the fact that we are dealing with a 
gauge theory has caught up with us! 
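
The non-existence of this inverse can be checked numerically. The sketch below assumes that the factor to be inverted is the transverse operator $g^\nu_\lambda - k^\nu k_\lambda/k^2$ with metric diag(+1, −1, −1, −1); the function names and the sample momentum are illustrative choices of ours, not taken from the text. Since the operator annihilates $k^\lambda$ itself, it has a zero eigenvalue and therefore no inverse.

```python
# Metric signature (+, -, -, -); the same numbers serve for g^{mu nu} and g_{mu nu}.
SIGN = [1.0, -1.0, -1.0, -1.0]


def lower(k_up):
    """Lower the index of a contravariant 4-vector."""
    return [SIGN[i] * k_up[i] for i in range(4)]


def transverse_operator(k_up):
    """Return T[nu][lam] = delta^nu_lam - k^nu k_lam / k^2."""
    k_dn = lower(k_up)
    k2 = sum(k_up[i] * k_dn[i] for i in range(4))
    return [[(1.0 if nu == lam else 0.0) - k_up[nu] * k_dn[lam] / k2
             for lam in range(4)] for nu in range(4)]


# Arbitrary off-shell momentum (illustrative values only).
k = [3.0, 1.0, 2.0, 0.5]
T = transverse_operator(k)

# Acting on k^lam gives zero, so T has a zero eigenvalue and cannot be inverted.
T_on_k = [sum(T[nu][lam] * k[lam] for lam in range(4)) for nu in range(4)]
print(T_on_k)
```

Any other choice of $k$ with $k^2 \neq 0$ gives the same conclusion, which is the momentum-space statement that a gauge must be fixed before a propagator can be defined.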

As we saw in section 7.3.2, to obtain a well-defined gauge field propagator
we need to fix the gauge. A clever way to do this in the present (spontaneously
broken) case was suggested by ’t Hooft (1971b). His proposal was to set

$$\partial_\mu\hat{A}^\mu = M\xi\hat{\chi}_2 \qquad (19.72)$$

where $\xi$ is an arbitrary gauge parameter¹ (not to be confused with the
superconducting coherence length). This condition is manifestly covariant, and
moreover it effectively reduces the degrees of freedom by one. Inserting (19.72)
into (19.67) we obtain

$$(\Box + M^2)\hat{A}^\nu - \partial^\nu(\partial_\mu\hat{A}^\mu)(1 - 1/\xi) = q(\hat{\chi}_2\partial^\nu\hat{\chi}_1 - \hat{\chi}_1\partial^\nu\hat{\chi}_2) - q^2\hat{A}^\nu(\hat{\chi}_1^2 + 2v\hat{\chi}_1 + \hat{\chi}_2^2). \qquad (19.73)$$

¹ We shall not enter here into the full details of quantization in such a gauge: we shall
effectively treat (19.72) as a classical field relation.


The operator appearing on the left-hand side now does have an inverse (see
problem 19.8) and yields the general form for the gauge boson propagator

$$i\left[-g^{\mu\nu} + \frac{(1 - \xi)k^\mu k^\nu}{k^2 - \xi M^2}\right](k^2 - M^2)^{-1}. \qquad (19.74)$$

This propagator is very remarkable². The standard massive vector boson
propagator

$$i(-g^{\mu\nu} + k^\mu k^\nu/M^2)(k^2 - M^2)^{-1} \qquad (19.75)$$

is seen to correspond to the limit $\xi \to \infty$, and in this gauge the high-energy
disease outlined in section 19.1 appears to threaten renormalizability (in fact,
it can be shown that there is a consistent set of Feynman rules for this gauge,
and the theory is renormalizable thanks to many cancellations of divergences).
For any finite $\xi$, however, the high-energy behaviour of the gauge boson
propagator is actually $\sim 1/k^2$, which is as good as the renormalizable theory of
QED (in Lorentz gauge). Note, however, that there seems to be another pole
in the propagator (19.74) at $k^2 = \xi M^2$: this is surely unphysical since it
depends on the arbitrary parameter $\xi$. A full treatment (’t Hooft 1971b) shows
that this pole is always cancelled by an exactly similar pole in the propagator
for the $\hat{\chi}_2$ field itself. These finite-$\xi$ gauges are called R gauges
(since they are ‘manifestly renormalizable’) and typically involve unphysical Higgs
fields such as $\hat{\chi}_2$. The infinite-$\xi$ gauge is known as the U gauge (U for
unitary) since only physical particles appear in this gauge. For tree diagram
calculations, of course, it is easiest to use the U gauge Feynman rules: the
technical difficulties with this gauge choice only enter in loop calculations, for
which the R gauge choice is easier.

Notice that in our master formula (19.74) for the gauge boson propagator
the limit $M \to 0$ may be safely taken (compare the remarks about this limit
for the ‘naive’ massive vector boson propagator in section 19.1). This yields
the massless vector boson (photon) propagator in a general $\xi$-gauge, exactly
as in equation (7.122) or (19.23).
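
As a numerical cross-check of problem 19.8, the following sketch contracts the momentum-space operator from the left-hand side of (19.73), taken here to be $(-k^2 + M^2)g^{\nu\mu} + k^\nu k^\mu(1 - 1/\xi)$, with the propagator (19.74) stripped of its factor i. The function names and the numerical values of $k$, $M$ and $\xi$ are arbitrary illustrative assumptions; the metric is diag(+1, −1, −1, −1).

```python
SIGN = [1.0, -1.0, -1.0, -1.0]  # metric diag(+1, -1, -1, -1)


def metric(i, j):
    return SIGN[i] if i == j else 0.0


def k_squared(k_up):
    return sum(SIGN[i] * k_up[i] ** 2 for i in range(4))


def wave_operator(k_up, M, xi):
    """O^{nu mu} = (-k^2 + M^2) g^{nu mu} + k^nu k^mu (1 - 1/xi)."""
    k2 = k_squared(k_up)
    return [[(-k2 + M ** 2) * metric(n, m) + k_up[n] * k_up[m] * (1.0 - 1.0 / xi)
             for m in range(4)] for n in range(4)]


def propagator_without_i(k_up, M, xi):
    """D_{mu rho} = [-g_{mu rho} + (1 - xi) k_mu k_rho / (k^2 - xi M^2)] / (k^2 - M^2)."""
    k2 = k_squared(k_up)
    k_dn = [SIGN[i] * k_up[i] for i in range(4)]
    return [[(-metric(m, r) + (1.0 - xi) * k_dn[m] * k_dn[r] / (k2 - xi * M ** 2))
             / (k2 - M ** 2) for r in range(4)] for m in range(4)]


def unitary_propagator_without_i(k_up, M):
    """D_{mu rho} = [-g_{mu rho} + k_mu k_rho / M^2] / (k^2 - M^2), i.e. (19.75) without i."""
    k2 = k_squared(k_up)
    k_dn = [SIGN[i] * k_up[i] for i in range(4)]
    return [[(-metric(m, r) + k_dn[m] * k_dn[r] / M ** 2) / (k2 - M ** 2)
             for r in range(4)] for m in range(4)]


def contract(a, b):
    """Contract the second (upper) index of a with the first (lower) index of b."""
    return [[sum(a[n][m] * b[m][r] for m in range(4)) for r in range(4)] for n in range(4)]


k, M, xi = [3.0, 1.0, 2.0, 0.5], 1.5, 2.5  # illustrative values only

product = contract(wave_operator(k, M, xi), propagator_without_i(k, M, xi))
identity_error = max(abs(product[n][r] - (1.0 if n == r else 0.0))
                     for n in range(4) for r in range(4))

# For very large xi the R-gauge propagator approaches the U-gauge form (19.75).
big = propagator_without_i(k, M, 1.0e9)
uni = unitary_propagator_without_i(k, M)
unitary_error = max(abs(big[m][r] - uni[m][r]) for m in range(4) for r in range(4))

print(identity_error, unitary_error)
```

Both printed numbers should be at the level of rounding error or of order $1/\xi$ respectively, which is a quick way of catching sign or index slips before attempting the algebra by hand.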

We now proceed with the generalization of these ideas to the non-Abelian 
SU(2) case, which is the one relevant to the electroweak theory. The general 
non-Abelian case was treated by Kibble (1967). 


² A vector boson propagator of similar form was first introduced by Lee and Yang (1962),
but their discussion was not within the framework of a spontaneously broken theory, so that
Higgs particles were not present, and the physical limit was obtained only as $\xi \to 0$.

19.6 Spontaneously broken local SU(2)×U(1) symmetry


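We shall limit our discussion of the spontaneous breaking of a local
non-Abelian symmetry to the particular case needed for the electroweak part of the
Standard Model. This is, in fact, just the local version of the model studied
in section 17.6. As noted there, the Lagrangian $\hat{\mathcal{L}}_\Phi$ of (17.97) is
invariant under global SU(2) transformations of the form (17.100), and also global
U(1) transformations (17.101). Thus in the local version we shall have to introduce
three SU(2) gauge fields (as in section 13.1), which we call $\hat{W}_i^\mu(x)$
(i = 1, 2, 3), and one U(1) gauge field $\hat{B}^\mu(x)$. We recall that the scalar
field $\hat{\phi}$ is an SU(2)-doublet

$$\hat{\phi} = \begin{pmatrix} \hat{\phi}^+ \\ \hat{\phi}^0 \end{pmatrix} \qquad (19.76)$$

so that the SU(2) covariant derivative acting on $\hat{\phi}$ is as given in (13.10),
namely

$$\hat{D}^\mu = \partial^\mu + ig\boldsymbol{\tau}\cdot\hat{\boldsymbol{W}}^\mu/2. \qquad (19.77)$$

To this must be added the U(1) piece, which we write as $ig'\hat{B}^\mu/2$, the
$\tfrac{1}{2}$ being for later convenience. The Lagrangian (without gauge-fixing and
ghost terms) is therefore

$$\hat{\mathcal{L}}_{G\Phi} = (\hat{D}_\mu\hat{\phi})^\dagger(\hat{D}^\mu\hat{\phi}) + \mu^2\hat{\phi}^\dagger\hat{\phi} - \frac{\lambda}{4}(\hat{\phi}^\dagger\hat{\phi})^2 - \frac{1}{4}\hat{\boldsymbol{F}}_{\mu\nu}\cdot\hat{\boldsymbol{F}}^{\mu\nu} - \frac{1}{4}\hat{G}_{\mu\nu}\hat{G}^{\mu\nu} \qquad (19.78)$$

where

$$\hat{D}^\mu\hat{\phi} = (\partial^\mu + ig\boldsymbol{\tau}\cdot\hat{\boldsymbol{W}}^\mu/2 + ig'\hat{B}^\mu/2)\hat{\phi}, \qquad (19.79)$$

$$\hat{\boldsymbol{F}}^{\mu\nu} = \partial^\mu\hat{\boldsymbol{W}}^\nu - \partial^\nu\hat{\boldsymbol{W}}^\mu - g\hat{\boldsymbol{W}}^\mu\times\hat{\boldsymbol{W}}^\nu, \qquad (19.80)$$

and

$$\hat{G}^{\mu\nu} = \partial^\mu\hat{B}^\nu - \partial^\nu\hat{B}^\mu. \qquad (19.81)$$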


We must now decide how to choose the non-zero vacuum expectation value
that breaks this symmetry. The essential point for the electroweak application
is that, after symmetry breaking, we should be left with three massive
gauge bosons (which will be the W± and Z⁰) and one massless gauge
boson, the photon. We may reasonably guess that the massless boson will
be associated with a symmetry that is unbroken by the vacuum expectation
value. Put differently, we certainly do not want a ‘superconducting’ massive
photon to emerge from the theory in this case, as the physical vacuum is not
an electromagnetic superconductor. This means that we do not want to give a
vacuum value to a charged field (as is done in the BCS ground state). On the
other hand, we do want it to behave as a ‘weak’ superconductor, generating
mass for W± and Z⁰. The choice suggested by Weinberg (1967) was

$$\langle 0|\hat{\phi}|0\rangle = \begin{pmatrix} 0 \\ v/\sqrt{2} \end{pmatrix} \qquad (19.82)$$

where $v/\sqrt{2} = \sqrt{2}\mu/\lambda^{1/2}$, which we already considered in the
global case in section 17.6. As pointed out there, (19.82) implies that the vacuum
remains invariant under the combined transformation of ‘U(1) + third component of
SU(2) isospin’; that is, (19.82) implies

$$\left(\tfrac{1}{2} + t_3^{(1/2)}\right)\langle 0|\hat{\phi}|0\rangle = 0 \qquad (19.83)$$

and hence

$$\langle 0|\hat{\phi}|0\rangle \to \langle 0|\hat{\phi}|0\rangle' = \exp\left[i\alpha\left(\tfrac{1}{2} + t_3^{(1/2)}\right)\right]\langle 0|\hat{\phi}|0\rangle = \langle 0|\hat{\phi}|0\rangle, \qquad (19.84)$$

where as usual $t_3^{(1/2)} = \tau_3/2$ (we are using lowercase t for isospin now,
anticipating that it is the weak, rather than hadronic, isospin; see chapter 21).

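We now need to consider oscillations about (19.82) in order to see the
physical particle spectrum. As in (17.107) we parametrize these conveniently
as

$$\hat{\phi} = \exp(-i\hat{\boldsymbol{\theta}}(x)\cdot\boldsymbol{\tau}/2v)\begin{pmatrix} 0 \\ \frac{1}{\sqrt{2}}(v + \hat{H}(x)) \end{pmatrix} \qquad (19.85)$$

(compare (19.45)). However this time, in contrast to (17.107) but just as in
(19.55), we can reduce the phase fields $\hat{\boldsymbol{\theta}}$ to zero by an
appropriate gauge transformation, and it is simplest to examine the particle
spectrum in this (unitary) gauge. Substituting

$$\hat{\phi} = \begin{pmatrix} 0 \\ \frac{1}{\sqrt{2}}(v + \hat{H}(x)) \end{pmatrix} \qquad (19.86)$$

into (19.78) and retaining only terms which are second order in the fields (i.e.
kinetic energies or mass terms) we find (problem 19.9)

$$\begin{aligned}
\hat{\mathcal{L}}^{\rm free}_{G\Phi} ={}& \tfrac{1}{2}\partial_\mu\hat{H}\partial^\mu\hat{H} - \mu^2\hat{H}^2 \\
& - \tfrac{1}{4}(\partial_\mu\hat{W}_{1\nu} - \partial_\nu\hat{W}_{1\mu})(\partial^\mu\hat{W}_1^\nu - \partial^\nu\hat{W}_1^\mu) + \tfrac{1}{8}g^2v^2\hat{W}_{1\mu}\hat{W}_1^\mu \\
& - \tfrac{1}{4}(\partial_\mu\hat{W}_{2\nu} - \partial_\nu\hat{W}_{2\mu})(\partial^\mu\hat{W}_2^\nu - \partial^\nu\hat{W}_2^\mu) + \tfrac{1}{8}g^2v^2\hat{W}_{2\mu}\hat{W}_2^\mu \\
& - \tfrac{1}{4}(\partial_\mu\hat{W}_{3\nu} - \partial_\nu\hat{W}_{3\mu})(\partial^\mu\hat{W}_3^\nu - \partial^\nu\hat{W}_3^\mu) - \tfrac{1}{4}\hat{G}_{\mu\nu}\hat{G}^{\mu\nu} \\
& + \tfrac{1}{8}v^2(g\hat{W}_{3\mu} - g'\hat{B}_\mu)(g\hat{W}_3^\mu - g'\hat{B}^\mu). \qquad (19.87)
\end{aligned}$$

The first line of (19.87) tells us that we have a scalar field of mass $\sqrt{2}\mu$
(the Higgs boson, again). The next two lines tell us that the components
$\hat{W}_1$ and $\hat{W}_2$ of the triplet $(\hat{W}_1, \hat{W}_2, \hat{W}_3)$ acquire
a mass (cf (19.56) in the U(1) case)

$$M_1 = M_2 = gv/2 \equiv M_W. \qquad (19.88)$$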
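The last two lines show us that the fields $\hat{W}_3$ and $\hat{B}$ are mixed. But
they can easily be unmixed by noting that the last term in (19.87) involves only
the combination $g\hat{W}_3^\mu - g'\hat{B}^\mu$, which evidently acquires a mass.
This suggests introducing the normalized linear combination

$$\hat{Z}^\mu = \cos\theta_W\,\hat{W}_3^\mu - \sin\theta_W\,\hat{B}^\mu \qquad (19.89)$$

where

$$\cos\theta_W = g/(g^2 + g'^2)^{1/2}, \qquad \sin\theta_W = g'/(g^2 + g'^2)^{1/2}, \qquad (19.90)$$

together with the orthogonal combination

$$\hat{A}^\mu = \sin\theta_W\,\hat{W}_3^\mu + \cos\theta_W\,\hat{B}^\mu. \qquad (19.91)$$

We then find that the last two lines of (19.87) become

$$-\tfrac{1}{4}(\partial_\mu\hat{Z}_\nu - \partial_\nu\hat{Z}_\mu)(\partial^\mu\hat{Z}^\nu - \partial^\nu\hat{Z}^\mu) + \tfrac{1}{8}v^2(g^2 + g'^2)\hat{Z}_\mu\hat{Z}^\mu - \tfrac{1}{4}\hat{F}_{\mu\nu}\hat{F}^{\mu\nu}, \qquad (19.92)$$

where

$$\hat{F}_{\mu\nu} = \partial_\mu\hat{A}_\nu - \partial_\nu\hat{A}_\mu. \qquad (19.93)$$

Thus

$$M_Z = \tfrac{1}{2}v(g^2 + g'^2)^{1/2} = M_W/\cos\theta_W \qquad (19.94)$$

and

$$M_A = 0. \qquad (19.95)$$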


Counting degrees of freedom as in the local U(1) case, we originally had 12 in
(19.78): three massless W’s and one massless B, which is 8 degrees of freedom
in all, together with 4 $\hat{\phi}$-fields, all with the same mass. After symmetry
breaking, we have three massive vector fields $W_1$, $W_2$ and $Z$ with 9 degrees
of freedom, one massless vector field $A$ with 2, and one massive scalar $H$.
Of course, the physical application will be to identify the W and Z fields
with those physical particles, the A field with the massless photon, and the H
field with the Higgs boson. In the gauge (19.86), the W and Z particles have
propagators of the form (19.22).
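
The unmixing of $\hat{W}_3$ and $\hat{B}$ and the counting of degrees of freedom can be illustrated numerically. In the sketch below the mass-squared matrix in the $(W_3, B)$ basis is read off from the term $\tfrac{1}{8}v^2(gW_3 - g'B)^2$; the numerical values of $g$, $g'$ and $v$ are illustrative assumptions of ours, chosen only to be roughly of the physical size, and the variable names are hypothetical.

```python
import math

# Illustrative inputs (assumptions, not fitted values): couplings and v in GeV.
g, g_prime, v = 0.65, 0.35, 246.0

norm = math.hypot(g, g_prime)
cos_w, sin_w = g / norm, g_prime / norm

# (v^2/8)(g W3 - g' B)^2 = (1/2) x^T M2 x with x = (W3, B).
M2 = [[v ** 2 * g ** 2 / 4, -v ** 2 * g * g_prime / 4],
      [-v ** 2 * g * g_prime / 4, v ** 2 * g_prime ** 2 / 4]]


def apply(matrix, vec):
    return [sum(matrix[i][j] * vec[j] for j in range(2)) for i in range(2)]


z_vec = [cos_w, -sin_w]  # Z = cos(theta_W) W3 - sin(theta_W) B
a_vec = [sin_w, cos_w]   # A = sin(theta_W) W3 + cos(theta_W) B

M_W = g * v / 2
M_Z = 0.5 * v * norm

M2_on_z = apply(M2, z_vec)  # expected: M_Z^2 times z_vec
M2_on_a = apply(M2, a_vec)  # expected: zero vector (massless photon)

# Degrees of freedom: 4 massless vectors (2 each) + 4 real scalars before breaking;
# 3 massive vectors (3 each) + 1 massless vector (2) + 1 scalar after.
dof_before = 4 * 2 + 4
dof_after = 3 * 3 + 2 + 1

print(M_W, M_Z, M_W / cos_w, M2_on_a, dof_before, dof_after)
```

The eigenvector with zero eigenvalue is the photon combination, and the other eigenvalue reproduces $M_Z^2$ with $M_Z = M_W/\cos\theta_W$.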

The identification of $\hat{A}^\mu$ with the photon field is made clearer if we look
at the form of $\hat{D}_\mu\hat{\phi}$ written in terms of $\hat{A}_\mu$ and
$\hat{Z}_\mu$, discarding the $\hat{W}_1$, $\hat{W}_2$ pieces:

$$\hat{D}_\mu\hat{\phi} = \left\{\partial_\mu + ig\sin\theta_W\left(\frac{1 + \tau_3}{2}\right)\hat{A}_\mu + \frac{ig}{\cos\theta_W}\left[\frac{\tau_3}{2} - \sin^2\theta_W\left(\frac{1 + \tau_3}{2}\right)\right]\hat{Z}_\mu\right\}\hat{\phi}. \qquad (19.96)$$

Now the operator $(1 + \tau_3)$ acting on $\langle 0|\hat{\phi}|0\rangle$ gives zero,
as observed in (19.83), and this is why $\hat{A}_\mu$ does not acquire a mass when
$\langle 0|\hat{\phi}|0\rangle \neq 0$ (gauge fields coupled to unbroken symmetries of
$\langle 0|\hat{\phi}|0\rangle$ do not become massive). Although certainly not unique,
this choice of $\hat{\phi}$ and $\langle 0|\hat{\phi}|0\rangle$ is undoubtedly very
economical and natural. We are interpreting the zero eigenvalue of $(1 + \tau_3)$ as
the electromagnetic charge of the vacuum, which we do not wish to be non-zero.
We then make the identification

$$e = g\sin\theta_W \qquad (19.97)$$

in order to get the right ‘electromagnetic $\hat{D}_\mu$’ in (19.96).
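
The rearrangement leading to (19.96) can be checked component by component. The sketch below assumes the neutral part of the covariant derivative is $ig\tau_3\hat{W}_{3\mu}/2 + ig'\hat{B}_\mu/2$, inverts (19.89) and (19.91) to write $W_3 = \cos\theta_W Z + \sin\theta_W A$ and $B = -\sin\theta_W Z + \cos\theta_W A$, and compares the coefficients of $A$ and $Z$ with those quoted in (19.96) for each eigenvalue of $\tau_3$. The coupling values and function names are illustrative assumptions.

```python
import math

g, g_prime = 0.65, 0.35  # illustrative couplings (assumptions)
norm = math.hypot(g, g_prime)
c, s = g / norm, g_prime / norm  # cos(theta_W), sin(theta_W)


def couplings_from_rotation(tau3):
    """Coefficients of i*A_mu and i*Z_mu obtained by substituting
    W3 = c Z + s A and B = -s Z + c A into g*tau3*W3/2 + g'*B/2."""
    a_coeff = g * tau3 / 2 * s + g_prime / 2 * c
    z_coeff = g * tau3 / 2 * c - g_prime / 2 * s
    return a_coeff, z_coeff


def couplings_from_19_96(tau3):
    """Coefficients of i*A_mu and i*Z_mu as written in (19.96)."""
    charge = (1 + tau3) / 2
    return g * s * charge, (g / c) * (tau3 / 2 - s ** 2 * charge)


upper = (couplings_from_rotation(+1), couplings_from_19_96(+1))
lower = (couplings_from_rotation(-1), couplings_from_19_96(-1))

# The lower component (tau3 = -1) carries the vacuum value and has zero photon coupling.
photon_coupling_of_vacuum = lower[0][0]
e_charge = g * s  # the identification (19.97)

print(upper, lower, photon_coupling_of_vacuum, e_charge)
```

The vanishing photon coupling of the lower component is the numerical counterpart of the statement that $(1 + \tau_3)$ annihilates the vacuum value, while the upper component couples with strength $e = g\sin\theta_W$.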

We emphasize once more that the particular form of (19.87) corresponds
to a choice of gauge, namely the unitary one (cf the discussions in sections
19.3 and 19.5). There is always the possibility of using other gauges, as in
the Abelian case, and this will in general be advantageous when doing loop
calculations involving renormalization. We would then return to a general
parametrization such as (cf (19.65) and (17.95))

$$\hat{\phi} = \begin{pmatrix} 0 \\ v/\sqrt{2} \end{pmatrix} + \frac{1}{\sqrt{2}}\begin{pmatrix} \hat{\phi}_2 - i\hat{\phi}_1 \\ \hat{\sigma} - i\hat{\phi}_3 \end{pmatrix}, \qquad (19.98)$$

and add ’t Hooft gauge-fixing terms

$$-\frac{1}{2\xi}\left\{\sum_{i=1,2}(\partial_\mu\hat{W}_i^\mu + \xi M_W\hat{\phi}_i)^2 + (\partial_\mu\hat{Z}^\mu + \xi M_Z\hat{\phi}_3)^2 + (\partial_\mu\hat{A}^\mu)^2\right\}. \qquad (19.99)$$

In this case the gauge boson propagators are all of the form (19.74), and
$\xi$-dependent. In such gauges, the Feynman rules will have to involve graphs
corresponding to exchange of quanta of the ‘unphysical’ fields $\hat{\phi}_i$, as well
as those of the physical Higgs scalar $\hat{\sigma}$. There will also have to be
suitable ghost interactions in the non-Abelian sector as discussed in section 13.3.3.
The complete Feynman rules of the electroweak theory are given in Appendix B
of Cheng and Li (1984), for example.

The model introduced here is actually the ‘Higgs sector’ of the Standard
Model, but without any couplings to fermions. We have seen how, by supposing
that the potential in (19.78) has the symmetry-breaking sign of the
parameter $\mu^2$, the W± and Z⁰ gauge bosons can be given masses. This seems
to be an ingenious and even elegant ‘mechanism’ for arriving at a renormalizable
theory of massive vector bosons. One may of course wonder whether
this ‘mechanism’ is after all purely phenomenological, somewhat akin to the
GL theory of a superconductor. In the latter case, we know that it can be
derived from ‘microscopic’ BCS theory, and this naturally leads to the question
whether there could be a similar underlying ‘dynamical’ theory, behind
the Higgs sector. It is, in fact, quite simple to construct a theory in which the
Higgs fields $\hat{\phi}$ appear as bound, or composite, states of heavy fermions.

But generating masses for the gauge bosons is not the only job that the 
Higgs sector does, in the Standard Model: it also generates masses for all 
the fermions. As we will see in chapter 22, the gauge symmetry of the weak
interactions is a chiral one, which requires that there should be no explicit 
fermion masses in the Lagrangian. We saw in chapter 18 how there is good 
evidence that the strong QCD interactions break chiral symmetry sponta- 
neously, but that there is also a need for small Lagrangian masses for the 
quarks, which break chiral symmetry explicitly (so as to give mass to the 
pions, for example). The leptons are of course not coupled to QCD, and 
we have to assume Lagrangian masses for them too. Thus for both quarks 
and leptons chiral-symmetry-breaking mass terms seem to be required. The 
only way to preserve the weak chiral gauge symmetry is to assume that these 
fermion masses must, in their turn, be interpreted as arising ‘spontaneously’ 
also; that is, not via an explicit mass term in the Lagrangian. The dynamical 
generation of quark and lepton masses would, in fact, be closely analogous 
to the generation of the energy gap in the BCS theory, as we saw in section 
18.1. So we may ask: is it possible to find a dynamical theory which gener- 
ates masses for both the vector bosons, and the fermions? Such theories are 
generically known as ‘technicolour models’ (Weinberg 1979b, Susskind 1979), 
and they have been intensively studied (see, for example, Peskin 1997). One 
problem is that such theories are already tightly constrained by the precision 
electroweak experiments (see chapter 22), and meeting these constraints seems 
to require rather elaborate kinds of models. However, technicolour theories do 
offer the prospect of a new strongly interacting sector, which could possibly 
be probed at the LHC. But such ideas take us beyond the scope of the present 
volume. Within the Standard Model, one proceeds along what seems a more 
phenomenological route, attributing the masses of fermions to their couplings 
with the Higgs field, in a way that will be explained in chapter 22: briefly, the
couplings have the Yukawa form $g_f\bar{\hat{f}}\hat{f}\hat{\phi}$, so that when
$\hat{\phi}$ develops a vev $v$, the fermions gain a mass $m_f = g_f v$.

We now turn, in the last part of the book, to weak interactions and the 
electroweak theory. 
Problems

19.1 Show that

$$[(-k^2 + M^2)g^{\nu\mu} + k^\nu k^\mu]\left(\frac{-g_{\mu\rho} + k_\mu k_\rho/M^2}{k^2 - M^2}\right) = g^\nu_\rho.$$

19.2 Verify (19.18).

19.3 Verify (19.19).

19.4 Verify (19.46).

19.5 Insert (19.55) into $\hat{\mathcal{L}}_{\rm H}$ of (19.40) and derive (19.56) for
the quadratic terms.

19.6 Insert (19.65) into $\hat{\mathcal{L}}_{\rm H}$ of (19.40) and derive the
quadratic terms of (19.69).

19.7 Derive (19.71).

19.8 Write the left-hand side of (19.73) in momentum space (as in (19.4)),
and show that the inverse of the factor multiplying $\hat{A}^\mu$ is (19.74) without
the ‘i’ (cf problem 19.1).

19.9 Verify (19.87).
Plate I
Comparison between measurements of $\alpha_s$ and the theoretical prediction, as a
function of the energy scale Q (Bethke 2009). (See figure 15.5 on page 129.)

Plate II
The light hadron spectrum of QCD, from Dürr et al. (2008). (See figure 16.12
on page 190.)

Plate III
Constraints in the $\bar{\rho}, \bar{\eta}$ plane. The shaded areas have 95% CL.
[Figure reproduced, courtesy Michael Barnett for the Particle Data Group, from the
review of the CKM Quark-Mixing Matrix by A Ceccucci, Z Ligeti and Y Sakai,
section 11 in the Review of Particle Physics, K Nakamura et al. (Particle Data
Group) Journal of Physics G 37 (2010) 075021, IOP Publishing Limited.]
(See figure 20.11 on page 323.)

Plate IV
(a) Number of $\eta_f = -1$ candidates in the signal region with a $B^0$ tag
($N_{B^0}$) and with a $\bar{B}^0$ tag ($N_{\bar{B}^0}$), and (b) the measured
asymmetry $(N_{B^0} - N_{\bar{B}^0})/(N_{B^0} + N_{\bar{B}^0})$, as functions of t;
(c) and (d) are the corresponding distributions for the $\eta_f = +1$ candidates.
Figure reprinted with permission from Aubert et al.
(BaBar Collaboration) Phys. Rev. Lett. 99 171803 (2007). Copyright 2007
by the American Physical Society. (See figure 21.7 on page 341.)
Introduction to the Phenomenology of Weak
Interactions 


Public letter to the group of the Radioactives at the district society meeting
in Tübingen:

Physikalisches Institut
der Eidg. Technischen Hochschule
Gloriastr.
Zürich

Zürich, 4. Dec. 1930

Dear Radioactive Ladies and Gentlemen,

As the bearer of these lines, to whom I graciously ask you to listen, will
explain to you in more detail, how because of the ‘wrong’ statistics of
the N and ⁶Li nuclei and the continuous β-spectrum, I have hit upon a
desperate remedy to save the ‘exchange theorem’ of statistics and the law
of conservation of energy. Namely, the possibility that there could exist in
the nuclei electrically neutral particles, that I wish to call neutrons, which
have the spin ½ and obey the exclusion principle and which further differ
from light quanta in that they do not travel with the velocity of light.
The mass of the neutrons should be of the same order of magnitude as
the electron mass and in any event not larger than 0.01 proton masses.
– The continuous β-spectrum would then become understandable by the
assumption that in β-decay, a neutron is emitted in addition to the electron
such that the sum of the energies of the neutron and electron is constant.

I admit that on a first look my way out might seem to be quite unlikely,
since one would certainly have seen the neutrons by now if they existed.
But nothing ventured nothing gained, and the seriousness of the matter
with the continuous β-spectrum is illustrated by a quotation of my honoured
predecessor in office, Mr. Debye, who recently told me in Brussels:
“Oh, it is best not to think about it, like the new taxes.” Therefore one
should earnestly discuss each way of salvation. – So, dear Radioactives,
examine and judge it. – Unfortunately I cannot appear in Tübingen personally,
since I am indispensable here in Zürich because of a ball on the
night of 6/7 December. – With my best regards to you, and also Mr.
Back, your humble servant,


W. Pauli 


Quoted from Winter (2000), pages 4-5. 


At the end of the previous chapter we arrived at an important part of the 
Lagrangian of the Standard Model, namely the terms involving just the gauge 
and Higgs fields. The full electroweak Lagrangian also includes, of course, the 
couplings of these fields to the quarks and leptons. We could at this point sim- 
ply write these couplings down, with little motivation, and proceed at once to 
discuss the empirical consequences. But such an approach, though economi- 
cal, would assume considerable knowledge of weak interaction phenomenology 
on the reader’s part. We prefer to keep this book as self-contained as possible, 
and so in the present chapter we shall provide an introduction to this phe- 
nomenology, following a ‘semi-historical’ route (for fuller historical treatments 
we refer the reader to Marshak et al. 1969, or to Winter 2000, for example). 

Much of what we shall discuss is still, for many purposes, a very useful
approximation to the full theory at energies well below the masses of the W±
(~80 GeV) and Z⁰ (~90 GeV). The reason for this is that in the electroweak
theory (chapter 22), tree-level amplitudes have a structure very similar to that
in the purely electromagnetic case, namely (see equation (8.101))

$$j^\mu_{\rm wk}\left(\frac{-g_{\mu\nu} + q_\mu q_\nu/M^2_{W,Z}}{q^2 - M^2_{W,Z}}\right)j^\nu_{\rm wk} \qquad (20.1)$$

where $j^\mu_{\rm wk}$ is a weak current, and we are using (19.75) for the propagator
of the exchanged W or Z bosons. For $q^2 \ll M^2_{W,Z}$, (20.1) becomes proportional
to the product of two currents; this ‘current–current’ form was for many years
the basis of weak interaction phenomenology, as we now describe.
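
To see how quickly the propagator factor in (20.1) reduces to a constant, one can compare it numerically with its low-energy limit $g_{\mu\nu}/M^2$. The sketch below is only an illustration: the boson mass of 80 GeV and the MeV-scale momentum transfer are assumed round numbers typical of nuclear β-decay, and the function name is ours.

```python
SIGN = [1.0, -1.0, -1.0, -1.0]  # metric diag(+1, -1, -1, -1)


def propagator_factor(q_dn, mass):
    """(-g_{mu nu} + q_mu q_nu / M^2) / (q^2 - M^2) for covariant components q_mu."""
    q2 = sum(SIGN[i] * q_dn[i] ** 2 for i in range(4))
    return [[(-(SIGN[m] if m == n else 0.0) + q_dn[m] * q_dn[n] / mass ** 2)
             / (q2 - mass ** 2) for n in range(4)] for m in range(4)]


M = 80.0                      # GeV, illustrative W mass
q = [0.002, 0.0, 0.0, 0.001]  # GeV, momentum transfer of a few MeV (assumed)

factor = propagator_factor(q, M)

# Low-energy ('current-current') limit: g_{mu nu} / M^2.
deviation = max(abs(factor[m][n] * M ** 2 - (SIGN[m] if m == n else 0.0))
                for m in range(4) for n in range(4))
print(deviation)
```

The deviation is of relative order $q^2/M^2$, which is why a point-like product of two currents describes low-energy weak processes so well.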




20.1  Fermi's ‘current–current’ theory of nuclear β-decay,
and its generalizations


The first quantum field theory of a weak interaction process was proposed
by Fermi (1934a,b) for nuclear β-decay, building on the ‘neutrino hypothesis’
of Pauli. In 1930, Pauli (in his ‘Dear Radioactive Ladies and Gentlemen’
letter) had suggested that the continuous e⁻ spectrum in β-decay could be
understood by supposing that, in addition to the e⁻, the decaying nucleus
also emitted a light, spin-½, electrically neutral particle, which he called the
‘neutron’. In this first version of the proposal, Pauli regarded his hypothetical
particle as a constituent of the nucleus. This had the attraction of solving not
only the problem with the continuous e⁻ spectrum, but a second problem as
well: what he called the ‘wrong’ statistics of the ¹⁴N and ⁶Li nuclei. Taking
¹⁴N for definiteness, the problem was as follows. Assuming that the nucleus
was somehow composed of the only particles (other than the photon) known
in 1930, namely electrons and protons, one requires 14 protons and 7 electrons
for the known charge of 7. This implies a half-odd integer value for the total
nuclear spin. But data from molecular spectra indicated that the nitrogen
nuclei obeyed Bose–Einstein, not Fermi–Dirac statistics, so that (if the usual
‘spin-statistics’ connection were to hold) the spin of the nitrogen nucleus
should be an integer, not a half-odd integer. This second part of Pauli's
hypothesis was quite soon overtaken by the discovery of the (real) neutron by
Chadwick (1932), after which it was rapidly accepted that nuclei consisted of
protons and (Chadwick’s) neutrons.

FIGURE 20.1
Four-fermion interaction for neutron β-decay.

However, the β-spectrum problem remained, and at the Solvay Conference
in 1933 Pauli restated his hypothesis (Pauli 1934), using now the name
‘neutrino’ which had meanwhile been suggested by Fermi. Stimulated by the
discussions at the Solvay meeting, Fermi then developed his theory of β-decay.
In the new picture of the nucleus, neither the electron nor the neutrino were to
be thought of as nuclear constituents. Instead, the electron-neutrino pair had
somehow to be created and emitted in the transition process of the nuclear
decay, much as a photon is created and emitted in nuclear γ-decay. Indeed,
Fermi relied heavily on the analogy with electromagnetism. The basic process
was assumed to be the transition neutron → proton, with the emission of an
e⁻ν pair, as shown in figure 20.1. The n and p were then regarded as
‘elementary’ and without structure (point-like); the whole process took place at a
single space-time point, like the emission of a photon in QED. Further, Fermi
conjectured that the nucleons participated via a weak interaction analogue of
the electromagnetic transition currents frequently encountered in volume 1 for
QED. In this case, however, rather than having the ‘charge conserving’ form
of $\bar{u}_p\gamma^\mu u_p$ for instance, the ‘weak current’ had the form
$\bar{u}_p\gamma^\mu u_n$, in which the charge of the nucleon changed. The lepton pair
was also charged, obviously. The whole interaction then had to be Lorentz invariant,
implying that the e⁻ν pair had also to appear in a similar (4-vector) ‘current’
form. Thus a ‘current–current’ amplitude was proposed, of the form

$$A\,\bar{u}_p\gamma^\mu u_n\,\bar{u}_{e^-}\gamma_\mu u_\nu, \qquad (20.2)$$

where A was a constant. Correspondingly, the process was described field
theoretically in terms of the local interaction density

$$A\,\bar{\hat{\psi}}_p(x)\gamma^\mu\hat{\psi}_n(x)\,\bar{\hat{\psi}}_e(x)\gamma_\mu\hat{\psi}_\nu(x). \qquad (20.3)$$

The discovery of positron β-decay soon followed, and then of electron capture;
these processes were easily accommodated by adding to (20.3) its Hermitian
conjugate

$$A\,\bar{\hat{\psi}}_n(x)\gamma^\mu\hat{\psi}_p(x)\,\bar{\hat{\psi}}_\nu(x)\gamma_\mu\hat{\psi}_e(x), \qquad (20.4)$$

taking A to be real. The sum of (20.3) and (20.4) gave a good account of
many observed characteristics of β-decay, when used to calculate transition
probabilities in first-order perturbation theory.

Soon after Fermi's theory was presented, however, it became clear that the
observed selection rules in some nuclear transitions could not be accounted
for by the forms (20.3) and (20.4). Specifically, in ‘allowed’ transitions (where
the orbital angular momentum carried by the leptons is zero) it was found
that, while for many transitions the nuclear spin did not change (ΔJ = 0),
for others of comparable strength a change of nuclear spin by one unit
(ΔJ = 1) occurred. Now, in nuclear decays the energy release is very small
(~ few MeV) compared to the mass of a nucleon, and so the non-relativistic
limit is an excellent approximation for the nucleon spinors. It is then easy to
see (problem 20.1) that, in this limit, the interactions (20.3) and (20.4) imply
that the nucleon spins cannot ‘flip’. Hence some other interaction(s) must
be present. Gamow and Teller (1936) introduced the general four-fermion
interaction, constructed from bilinear combinations of the nucleon pair and of
the lepton pair, but not their derivatives. For example, the combination

$$\bar{\hat{\psi}}_p(x)\hat{\psi}_n(x)\,\bar{\hat{\psi}}_e(x)\hat{\psi}_\nu(x) \qquad (20.5)$$

could occur, and also

$$\bar{\hat{\psi}}_p(x)\sigma_{\mu\nu}\hat{\psi}_n(x)\,\bar{\hat{\psi}}_e(x)\sigma^{\mu\nu}\hat{\psi}_\nu(x) \qquad (20.6)$$

where

$$\sigma_{\mu\nu} = \frac{i}{2}(\gamma_\mu\gamma_\nu - \gamma_\nu\gamma_\mu). \qquad (20.7)$$

The non-relativistic limit of (20.5) gives ΔJ = 0, but (20.6) allows ΔJ = 1.
Other combinations are also possible, as we shall discuss shortly. Note that
the interaction must always be Lorentz invariant.
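
The spin-flip statement can be made concrete with explicit Dirac matrices. The sketch below assumes the standard Dirac representation and treats the nucleons as at rest, so that each spinor has only upper components; a bilinear $\bar{u}\,\Gamma\,u$ then reduces to the upper-left 2×2 block of $\gamma^0\Gamma$ sandwiched between two-component spinors. All names are ours, and the check covers only the vector and tensor structures discussed above.

```python
I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def negate(m):
    return [[-x for x in row] for row in m]


def blocks(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def upper_left(m):
    """2x2 block seen by rest-frame spinors with vanishing lower components."""
    return [row[:2] for row in m[:2]]


# Dirac representation: gamma^0 = diag(1, -1), gamma^i = [[0, sigma_i], [-sigma_i, 0]].
GAMMA0 = blocks(I2, Z2, Z2, negate(I2))
GAMMA_SPACE = [blocks(Z2, s, negate(s), Z2) for s in PAULI]


def sigma_tensor(i, j):
    """sigma^{ij} = (i/2)[gamma^i, gamma^j] for spatial indices (0-based)."""
    ab = matmul(GAMMA_SPACE[i], GAMMA_SPACE[j])
    ba = matmul(GAMMA_SPACE[j], GAMMA_SPACE[i])
    return [[0.5j * (x - y) for x, y in zip(ra, rb)] for ra, rb in zip(ab, ba)]


# Vector current: gamma^0 gamma^0 gives the unit matrix, gamma^0 gamma^i gives zero.
vector_time = upper_left(matmul(GAMMA0, GAMMA0))
vector_space = [upper_left(matmul(GAMMA0, gi)) for gi in GAMMA_SPACE]

# Tensor current: gamma^0 sigma^{23} gives the Pauli matrix sigma_1, which flips spin.
tensor_23 = upper_left(matmul(GAMMA0, sigma_tensor(1, 2)))

print(vector_time, vector_space[0], tensor_23)
```

In this limit the vector interaction acts as the unit matrix in spin space, so it cannot change the nucleon spin, whereas the tensor structure brings in Pauli matrices and so permits ΔJ = 1.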

Thus began a long period of difficult experimentation to establish the
correct form of the β-decay interaction. With the discovery of the muon (in
1937) and the pion (ten years later) more weak decays became experimentally
accessible, for example μ decay

$$\mu^- \to e^- + \nu + \nu \qquad (20.8)$$

and π decay

$$\pi^- \to e^- + \nu. \qquad (20.9)$$

Note that we have deliberately called all the neutrinos just ‘ν’, without any
particle/antiparticle indication, or lepton flavour label; we shall have more to
say on these matters in section 20.3. There were hopes that the couplings of
the pairs (p, n), (ν, e⁻) and (ν, μ⁻) might have the same form (‘universality’)
but the data were incomplete, and in part apparently contradictory.

The breakthrough came in 1956, when Lee and Yang (1956) suggested that 
parity was not conserved in all weak decays. Hitherto, it had always been as- 
sumed that any physical interaction had to be such that parity was conserved, 
and this assumption had been built into the structure of the proposed β-decay
interactions, such as (20.3), (20.5) or (20.6). Once it was looked for properly, 
following the analysis of Lee and Yang, parity violation was indeed found to 
be a strikingly evident feature of weak interactions. 




20.2 Parity violation in weak interactions, 
and V-A theory 


20.2.1 Parity violation 


In 1957, the experiment of Wu et al. (1957) established for the first time that
parity was violated in a weak interaction, specifically nuclear β-decay. The
experiment involved a sample of ⁶⁰Co (J = 5) cooled to 0.01 K in a solenoid.
At this temperature most of the nuclear spins are aligned by the magnetic field,
and so there is a net polarization $\langle\boldsymbol{J}\rangle$, which is in the
direction opposite to the applied magnetic field. ⁶⁰Co decays to ⁶⁰Ni (J = 4), a
ΔJ = 1 transition. The degree of ⁶⁰Co alignment was measured from observations of
the angular distribution of γ-rays from ⁶⁰Ni. The relative intensities of electrons
emitted along and against the magnetic field direction were measured, and the results
were consistent with a distribution of the form

$$I(\theta) = 1 - \langle\boldsymbol{J}\rangle\cdot\boldsymbol{p}/E \qquad (20.10)$$
$$\phantom{I(\theta)} = 1 - Pv\cos\theta \qquad (20.11)$$

where $v$, $\boldsymbol{p}$ and $E$ are respectively the electron speed, momentum and
energy, $P$ is the magnitude of the polarization, and $\theta$ is the angle of
emission of the electron with respect to $\langle\boldsymbol{J}\rangle$.

Why does this indicate parity violation? To see this, we recall from the 
discussion of the parity operation P in section 4.2.1 that the angular 
momentum J is an axial vector such that ⟨J⟩ → ⟨J⟩ under P, while p is a polar 
vector transforming by p → −p. Hence, in the parity-transformed system, 
the distribution (20.11) would have the form 

$$I_{\rm P}(\theta) = 1 + P v \cos\theta. \qquad (20.12)$$


The difference between (20.12) and (20.11) implies that, by performing the 
measurement, we can determine which of the two coordinate systems we must 
in fact be using. The two are inequivalent, in contrast to all the other coordi- 
nate system equivalences which we have previously studied (e.g. under three- 
dimensional rotations, and Lorentz transformations). This is an operational 
consequence of ‘parity violation’. The crucial point in this example, evidently, 
is the appearance of the pseudoscalar quantity ⟨J⟩·p in (20.10), alongside the 
obviously scalar quantity ‘1’. 
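
The operational content of (20.11) and (20.12) can be illustrated numerically. The following is a minimal Python sketch and not part of the original analysis: the function names and the values chosen for P and v are illustrative assumptions, used only to show that the forward-backward asymmetry of the electron intensity changes sign between the two coordinate systems.

```python
import math

def intensity(theta, polarization, speed):
    """Eq. (20.11): I(theta) = 1 - P v cos(theta), with v in units of c."""
    return 1.0 - polarization * speed * math.cos(theta)

def intensity_parity_transformed(theta, polarization, speed):
    """Eq. (20.12): <J>.p changes sign under P, so the cos(theta) term flips."""
    return 1.0 + polarization * speed * math.cos(theta)

def forward_backward_asymmetry(distribution, polarization, speed):
    """(I(0) - I(pi)) / (I(0) + I(pi)) for the given distribution."""
    forward = distribution(0.0, polarization, speed)
    backward = distribution(math.pi, polarization, speed)
    return (forward - backward) / (forward + backward)

# Illustrative values only (assumed, not the Wu et al. data).
P, v = 0.6, 0.7
asym_observed = forward_backward_asymmetry(intensity, P, v)
asym_mirrored = forward_backward_asymmetry(intensity_parity_transformed, P, v)
print(f"observed: {asym_observed:+.3f}, parity-transformed: {asym_mirrored:+.3f}")
```

The two asymmetries are −Pv and +Pv respectively, so a measurement of the sign distinguishes the two systems, which is the statement made in the text.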

The Fermi theory, employing only vector currents, needs a modification 
to accommodate this result. We saw in section 4.2.1 that a combination of 
vector (‘V’) and axial vector (‘A’) currents would be parity-violating. Indeed, 
after many years of careful experiments, and many false trails, it was even- 
tually established (always, of course, to within some experimental error) that 
the currents participating in Fermi’s current-current interaction are, in fact, 
certain combinations of V-type and A-type currents, for both nucleons and 
leptons. 


20.2.2 V-A theory: chirality and helicity 


Quite soon after the discovery of parity violation, Sudarshan and Marshak 
(1958), and then Feynman and Gell-Mann (1958) and Sakurai (1958), 
proposed a specific form for the current-current interaction, namely the V-A 
(‘V minus A’) structure. For example, in place of the leptonic combination 
$\bar{u}_{e^-}\gamma_\mu u_\nu$, these authors proposed the form 
$\bar{u}_{e^-}\gamma_\mu(1-\gamma_5)u_\nu$, being the difference (with equal 
weight) of a V-type and an A-type current. For the part 
involving the nucleons the proposal was slightly more complicated, having the 
form $\bar{u}_p\gamma_\mu(1-r\gamma_5)u_n$ where r had the empirical value r ≈ 1.2. From our 
present perspective, of course, the hadronic transition is actually occurring at 
the quark level, so that rather than a transition n → p we now think in terms 
of a d → u one. In this case, the remarkable fact is that the appropriate 
current to use is, once again, essentially the simple ‘V-A’ one, 
$\bar{u}_u\gamma_\mu(1-\gamma_5)u_d$ ¹. 
This V-A structure for quarks and leptons is fundamental to the Standard 
Model. 


1 We shall see in section 20.7 that a slight modification is necessary. 
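
We must now at once draw the reader’s attention to a rather remarkable 
feature of this V-A structure, which is that the (1 − γ5) factor can be thought 
of as acting either on the u spinor or on the ū spinor. Consider, for example, 
a term $\bar{u}_{e^-}\gamma_\mu(1-\gamma_5)u_\nu$. Since γ5 is Hermitian and 
commutes with $\beta\gamma_\mu$, we have 

$$\begin{aligned} \bar{u}_{e^-}\gamma_\mu(1-\gamma_5)u_\nu &= u^\dagger_{e^-}\beta\gamma_\mu(1-\gamma_5)u_\nu = u^\dagger_{e^-}(1-\gamma_5)\beta\gamma_\mu u_\nu \\ &= [(1-\gamma_5)u_{e^-}]^\dagger\beta\gamma_\mu u_\nu = \overline{[(1-\gamma_5)u_{e^-}]}\,\gamma_\mu u_\nu. \qquad (20.13)\end{aligned}$$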


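To understand the significance of this, it is advantageous to work in the 
representation (3.40) of the Dirac matrices, in which γ5 is diagonal, namely 

$$\gamma_5 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \quad \boldsymbol{\alpha} = \begin{pmatrix} \boldsymbol{\sigma} & 0 \\ 0 & -\boldsymbol{\sigma} \end{pmatrix} \quad \beta = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \quad \boldsymbol{\gamma} = \begin{pmatrix} 0 & -\boldsymbol{\sigma} \\ \boldsymbol{\sigma} & 0 \end{pmatrix}. \qquad (20.14)$$
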
Readers who have not worked through problem 9.4 might like to do so now; 
we may also suggest a backward glance at section 12.4.2 and chapter 17. 


First of all it is clear that any combination ‘(1 − γ5)u’ is an eigenstate of 
γ5 with eigenvalue −1: 

$$\gamma_5(1-\gamma_5)u = (\gamma_5 - 1)u = -(1-\gamma_5)u \qquad (20.15)$$


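using $\gamma_5^2 = 1$. In the terminology of section 12.4.2, ‘(1 − γ5)u’ has definite 
chirality, namely L (‘left-handed’), meaning that it belongs to the eigenvalue 
−1 of γ5. We may introduce the projection operators $P_{\rm R}$, $P_{\rm L}$ of section 12.4.2, 

$$P_{\rm R} \equiv \left(\frac{1+\gamma_5}{2}\right) \qquad P_{\rm L} \equiv \left(\frac{1-\gamma_5}{2}\right) \qquad (20.16)$$

satisfying 

$$P_{\rm R}^2 = P_{\rm R} \quad P_{\rm L}^2 = P_{\rm L} \quad P_{\rm R}P_{\rm L} = P_{\rm L}P_{\rm R} = 0 \quad P_{\rm R} + P_{\rm L} = 1, \qquad (20.17)$$

and define 

$$u_{\rm L} \equiv P_{\rm L}u, \qquad u_{\rm R} \equiv P_{\rm R}u \qquad (20.18)$$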


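for any u. Then 

$$\begin{aligned} \bar{u}_1\gamma_\mu\left(\frac{1-\gamma_5}{2}\right)u_2 &= \bar{u}_1\gamma_\mu P_{\rm L}u_2 = \bar{u}_1\gamma_\mu P_{\rm L}^2u_2 \\ &= \bar{u}_1\gamma_\mu P_{\rm L}u_{2{\rm L}} = \bar{u}_1P_{\rm R}\gamma_\mu u_{2{\rm L}} \\ &= u_1^\dagger P_{\rm L}\beta\gamma_\mu u_{2{\rm L}} = \bar{u}_{1{\rm L}}\gamma_\mu u_{2{\rm L}} \qquad (20.19)\end{aligned}$$

which formalizes (20.13) and emphasizes the fact that only the chiral L 
components of the u spinors enter into weak interactions, a remarkably simple 
statement. 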
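
The algebra behind (20.13), (20.17) and (20.19) can be checked with explicit matrices. The sketch below is an illustration only, written in plain Python with hand-rolled 4×4 helpers (all helper names are assumptions of this example); it builds the matrices of (20.14) and tests the projector identities, the fact that γ5 commutes with βγ^μ, and the relation γ^μ P_L = P_R γ^μ used in (20.19).

```python
# 2x2 building blocks: identity, zero and the Pauli matrices.
I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from 2x2 blocks."""
    return [a[i] + b[i] for i in range(2)] + [c[i] + d[i] for i in range(2)]

def neg(m):
    return [[-x for x in row] for row in m]

def scale(c, m):
    return [[c * x for x in row] for row in m]

def add(a, b, sign=1):
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

# Representation (20.14): gamma5 diagonal, beta = gamma^0 off-diagonal.
ONE = block(I2, Z2, Z2, I2)
ZERO = block(Z2, Z2, Z2, Z2)
GAMMA5 = block(I2, Z2, Z2, neg(I2))
BETA = block(Z2, I2, I2, Z2)
GAMMAS = [BETA] + [block(Z2, neg(s), s, Z2) for s in SIGMA]  # gamma^0..gamma^3

P_R = scale(0.5, add(ONE, GAMMA5))
P_L = scale(0.5, add(ONE, GAMMA5, sign=-1))

checks = {
    "P_R^2 = P_R": close(matmul(P_R, P_R), P_R),
    "P_L^2 = P_L": close(matmul(P_L, P_L), P_L),
    "P_R P_L = 0": close(matmul(P_R, P_L), ZERO),
    "P_R + P_L = 1": close(add(P_R, P_L), ONE),
    "P_L keeps only the lower two components":
        close(P_L, block(Z2, Z2, Z2, I2)),
    "gamma5 commutes with beta gamma^mu": all(
        close(matmul(GAMMA5, matmul(BETA, g)), matmul(matmul(BETA, g), GAMMA5))
        for g in GAMMAS),
    "gamma^mu P_L = P_R gamma^mu": all(
        close(matmul(g, P_L), matmul(P_R, g)) for g in GAMMAS),
}
for name, ok in checks.items():
    print(f"{name}: {ok}")
```

The checks are representation-specific only in the statement about the lower two components; the remaining identities follow from γ5² = 1 and the anticommutation of γ5 with the γ^μ.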

To see the physical consequences of this, we need the forms of the Dirac 
spinors in this new representation, which we shall now derive explicitly, for 
convenience. As usual, positive energy spinors are defined as solutions of 
$(\not{p} - m)u = 0$, so that writing 

$$u = \begin{pmatrix} \phi \\ \chi \end{pmatrix} \qquad (20.20)$$

we obtain 

$$\begin{aligned} (E - \boldsymbol{\sigma}\cdot\mathbf{p})\phi &= m\chi \\ (E + \boldsymbol{\sigma}\cdot\mathbf{p})\chi &= m\phi. \qquad (20.21)\end{aligned}$$


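A convenient choice of 2-component spinors φ, χ is to take them to be helicity 
eigenstates (see section 3.3). For example, the eigenstate φ₊ with positive 
helicity λ = +1 satisfies 

$$\boldsymbol{\sigma}\cdot\mathbf{p}\,\phi_+ = |\mathbf{p}|\phi_+ \qquad (20.22)$$

while the eigenstate φ₋ with λ = −1 satisfies (20.22) with a minus on the 
right-hand side. Thus the spinor u(p, λ = +1) can be written as 

$$u(p,\lambda=+1) = N\begin{pmatrix} \phi_+ \\ \dfrac{(E-|\mathbf{p}|)}{m}\phi_+ \end{pmatrix}. \qquad (20.23)$$

The normalization N is fixed as usual by requiring $\bar{u}u = 2m$, from which it 
follows (problem 20.2) that $N = (E+|\mathbf{p}|)^{1/2}$. Thus finally we have 

$$u(p,\lambda=+1) = \begin{pmatrix} \sqrt{E+|\mathbf{p}|}\,\phi_+ \\ \sqrt{E-|\mathbf{p}|}\,\phi_+ \end{pmatrix}. \qquad (20.24)$$

Similarly 

$$u(p,\lambda=-1) = \begin{pmatrix} \sqrt{E-|\mathbf{p}|}\,\phi_- \\ \sqrt{E+|\mathbf{p}|}\,\phi_- \end{pmatrix}. \qquad (20.25)$$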
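
As a quick consistency check on (20.24) and (20.25), the spinors can be inserted back into (20.21). The sketch below is a hedged illustration: it assumes the momentum lies along the +z axis, so that σ·p is diagonal and the helicity eigenstates are simply (1, 0) and (0, 1); the function names and the sample values of E and m are choices made for this example.

```python
import math

def helicity_spinor(energy, mass, helicity):
    """Components (phi, chi) of u(p, lambda) from (20.24)/(20.25), p along +z."""
    p = math.sqrt(energy**2 - mass**2)
    two_spinor = (1.0, 0.0) if helicity == +1 else (0.0, 1.0)
    upper = math.sqrt(energy + helicity * p)  # coefficient of phi
    lower = math.sqrt(energy - helicity * p)  # coefficient of chi
    phi = [upper * c for c in two_spinor]
    chi = [lower * c for c in two_spinor]
    return phi, chi, p

def sigma_dot_p(spinor, p):
    """sigma.p acting on a 2-spinor when p points along +z (sigma_z * |p|)."""
    return [p * spinor[0], -p * spinor[1]]

def dirac_residual(energy, mass, helicity):
    """Largest violation of the two equations in (20.21)."""
    phi, chi, p = helicity_spinor(energy, mass, helicity)
    sp_phi, sp_chi = sigma_dot_p(phi, p), sigma_dot_p(chi, p)
    top = [energy * phi[i] - sp_phi[i] - mass * chi[i] for i in range(2)]
    bottom = [energy * chi[i] + sp_chi[i] - mass * phi[i] for i in range(2)]
    return max(abs(x) for x in top + bottom)

def ubar_u(energy, mass, helicity):
    """ubar u = u^dagger beta u = phi^dagger chi + chi^dagger phi (real here)."""
    phi, chi, _ = helicity_spinor(energy, mass, helicity)
    return 2.0 * sum(a * b for a, b in zip(phi, chi))

E, m = 2.0, 1.0  # arbitrary units, chosen for illustration
for lam in (+1, -1):
    print(lam, dirac_residual(E, m, lam), ubar_u(E, m, lam))
```

Both helicities satisfy (20.21) to rounding accuracy and reproduce the normalization ūu = 2m.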


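Now we have agreed that only the chiral ‘L’ components of all u-spinors 
enter into weak interactions, in the Standard Model. But from the explicit 
form of γ5 given in (20.14), we see that when acting on any spinor u, the 
projector $P_{\rm L}$ ‘kills’ the top two components: 

$$P_{\rm L}\begin{pmatrix} \phi \\ \chi \end{pmatrix} = \begin{pmatrix} 0 \\ \chi \end{pmatrix}. \qquad (20.26)$$

In particular 

$$P_{\rm L}u(p,\lambda=+1) = \begin{pmatrix} 0 \\ \sqrt{E-|\mathbf{p}|}\,\phi_+ \end{pmatrix} \qquad (20.27)$$

and 

$$P_{\rm L}u(p,\lambda=-1) = \begin{pmatrix} 0 \\ \sqrt{E+|\mathbf{p}|}\,\phi_- \end{pmatrix}. \qquad (20.28)$$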




Equations (20.27) and (20.28) are very important. In particular, equation 
(20.27) implies that in the limit of zero mass m (and hence E → |p|), only 
the negative helicity u-spinor will enter. More quantitatively, using 

$$\sqrt{E-|\mathbf{p}|} \approx \frac{m}{\sqrt{2E}} \quad \text{for } m \ll E, \qquad (20.29)$$

we can say that positive helicity components of all fermions are suppressed in 
V-A matrix elements, relative to the negative helicity components, by factors 
of order (m/E). Bearing in mind that the helicity operator σ·p/|p| is a 
pseudoscalar, this ‘unequal’ treatment for λ = +1 and λ = −1 components is, 
of course, precisely related to the parity violation built in to the V-A structure. 
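
The size of the suppression can be made concrete. The following sketch compares the exact ratio of the surviving components in (20.27) and (20.28), namely √(E−|p|)/√(E+|p|), with the estimate m/2E that follows from (20.29). The electron mass value and the sample energies are inputs assumed for this illustration.

```python
import math

def wrong_helicity_ratio(energy, mass):
    """sqrt(E-|p|)/sqrt(E+|p|), rewritten as m/(E+|p|) using E^2 - p^2 = m^2.

    The rewritten form avoids the cancellation in E - |p| when m << E.
    """
    p = math.sqrt(energy**2 - mass**2)
    return mass / (energy + p)

M_ELECTRON = 0.511e-3  # GeV (rounded value, assumed here)

rows = []
for energy in (0.001, 0.01, 1.0):  # GeV
    exact = wrong_helicity_ratio(energy, M_ELECTRON)
    approx = M_ELECTRON / (2.0 * energy)
    rows.append((energy, exact, approx))
    print(f"E = {energy:g} GeV: exact {exact:.3e}, m/2E {approx:.3e}")
```

At E = 1 MeV the electron is not yet ultra-relativistic and the estimate is only rough; by E = 1 GeV the two agree closely, and the positive helicity component is suppressed by a factor of a few times 10⁻⁴ in amplitude.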


A similar analysis may be done for the v-spinors. They satisfy $(\not{p} + m)v = 0$ 
and the normalization $\bar{v}v = -2m$. We must however remember the ‘small 
subtlety’ to do with the labelling of v-spinors, discussed in section 3.4.3: the 
2-component spinors χ₋ in v(p, λ = +1) actually satisfy σ·p χ₋ = −|p|χ₋, 
and similarly the χ₊’s in v(p, λ = −1) satisfy σ·p χ₊ = |p|χ₊. We then find 
(problem 20.3) the results 

$$v(p,\lambda=+1) = \begin{pmatrix} -\sqrt{E-|\mathbf{p}|}\,\chi_- \\ \sqrt{E+|\mathbf{p}|}\,\chi_- \end{pmatrix} \qquad (20.30)$$

and 

$$v(p,\lambda=-1) = \begin{pmatrix} \sqrt{E+|\mathbf{p}|}\,\chi_+ \\ -\sqrt{E-|\mathbf{p}|}\,\chi_+ \end{pmatrix}. \qquad (20.31)$$


Once again, the action of $P_{\rm L}$ removes the top two components, leaving the 
result that, in the massless limit, only the λ = +1 state survives. Recalling the 
‘hole theory’ interpretation of section 3.4.3, this would mean that the positive 
helicity components of all antifermions dominate in V-A interactions, negative 
helicity components being suppressed by factors of order m/E. The 
proportionality of the negative helicity amplitude to the mass of the antifermion is 
of course exactly as noted for π⁺ → μ⁺ν_μ decay in section 18.2. 

We should emphasize that although the above results, stated in italics, 
were derived in the convenient representation (20.14) for the Dirac matrices, 
they actually hold independently of any choice of representation. This can be 
shown by using general helicity projection operators. 


In Pauli’s original letter, he suggested that the mass of the neutrino might 
be of the same order as the electron mass. Immediately after the discovery of 
parity violation, it was realized that the result could be elegantly explained 
by the assumption that the neutrinos were strictly massless particles (Landau 
1957, Lee and Yang 1957 and Salam 1957). In this case, u and v spinors 
satisfy the same equation $\not{p}\,(u \text{ or } v) = 0$, which reduces via (20.21) (in the 
m = 0 limit) to the two independent two-component ‘Weyl’ equations 

$$E\phi_0 = \boldsymbol{\sigma}\cdot\mathbf{p}\,\phi_0 \qquad E\chi_0 = -\boldsymbol{\sigma}\cdot\mathbf{p}\,\chi_0. \qquad (20.32)$$

Remembering that E = |p| for a massless particle, we see that φ₀ has positive 
helicity and χ₀ negative helicity. In this strictly massless case, helicity is 
Lorentz invariant, since the direction of p cannot be reversed by a velocity 
transformation with v < c. Furthermore, each of the equations in (20.32) 
violates parity, since E is clearly a scalar while σ·p is a pseudoscalar (note 
that when m ≠ 0 we can infer from (20.21) that, in this representation, φ ↔ χ 
under P, which is consistent with (20.32) and with the form of β in (20.14)). 
Thus the (massless) neutrino could be ‘blamed’ for the parity violation. In 
this model, neutrinos have one definite helicity, either positive or negative. As 
we have seen, the massless limit of the (four-component) V-A theory leads to 
the same conclusion. 

Which helicity is actually chosen by Nature was determined in a classic 
experiment by Goldhaber et al. (1958), involving the K-capture reaction 


$$e^- + {}^{152}\mathrm{Eu} \to \nu + {}^{152}\mathrm{Sm}^*, \qquad (20.33)$$


as described by Bettini (2008), for example. They found that the helicity 
of the emitted neutrino was (within errors) 100% negative, a result taken as 
confirming the ‘2-component’ neutrino theory, and the V-A theory. 

We now know that neutrinos are not massless. This information does not 
come from studies of nuclear decays, but rather from a completely different 
phenomenon — that of neutrino oscillations, which we shall mention again in 
the following section, and treat more fully in section 21.4. Neutrino masses 
are so small that the existence of the ‘wrong helicity’ component cannot be 
detected experimentally in processes such as (20.33), or indeed in any of the 
reactions we shall discuss, apart from neutrino oscillations. 

In section 4.2.2 we introduced the charge conjugation operation C (see also 
section 7.5.2). As we noted there, C is not a good symmetry in weak interac- 
tions. The V-A interaction treats a negative helicity fermion very differently 
from a negative helicity antifermion, while one is precisely transformed into 
the other under C. However, it is clear that the helicity operator itself is 
odd under P. Thus the CP conjugate of a negative helicity fermion is a 
positive helicity antifermion, which is what the V-A interaction selects. It may 
easily be verified (problem 20.4) that the ‘2-component’ theory of (20.32) 
automatically incorporates CP invariance. Elegance notwithstanding, how- 
ever, there are CP-violating weak interactions, as mentioned in section 4.2.3. 
How this is accommodated within the Standard Model we shall discuss in 
section 20.7.3. 

For charged fermions the distinction between particle and antiparticle is 
clear; but is there a conserved quantum number which we can use instead of 
charge to distinguish a neutrino from an antineutrino? That is the question 
to which we now turn. 


20.3 Lepton number and lepton flavours 


In section 1.2.1 of volume 1 we gave a brief discussion of leptonic quantum 
numbers (“lepton flavours”), adopting a traditional approach in which the data 
is interpreted in terms of conserved quantum numbers carried by neutrinos, 
which serve to distinguish neutrinos from antineutrinos. We must now exam- 
ine the matter more closely, in the light of what we have learned about the 
helicity properties of the V-A interaction. 

In 1955, Davis (1955), following a suggestion made by Pontecorvo (1946), 
argued as follows. Consider the e⁻ capture reaction e⁻ + p → ν + n, which was 
of course well established. Then in principle the inverse reaction ν + n → e⁻ + p 
should also exist. Of course, the cross section is extremely small, but by using 
a large enough target volume this might perhaps be compensated. Specifically, 
the reaction $\nu + {}^{37}_{17}\mathrm{Cl} \to e^- + {}^{37}_{18}\mathrm{Ar}$ was proposed, the argon being detected 
through its radioactive decay. Suppose, however, that the ‘neutrinos’ actually 
used are those which accompany electrons in β⁻-decay. If (as was supposed 
in section 1.2.1) these are to be regarded as antineutrinos, ‘$\bar{\nu}$’, carrying a 
conserved lepton number, then the reaction 

$$\text{‘}\bar{\nu}\text{’} + {}^{37}_{17}\mathrm{Cl} \to e^- + {}^{37}_{18}\mathrm{Ar} \qquad (20.34)$$


should not be observed. If, on the other hand, the ‘ν’ in the capture process 
and the ‘$\bar{\nu}$’ in β-decay are not distinguished by the weak interaction, the 
reaction (20.34) should be observed. Davis found no evidence for reaction 
(20.34), at the expected level of cross section, a result which could clearly be 
interpreted as confirming the ‘conserved electron number hypothesis’. 

However, another interpretation is possible. The e⁻ in β-decay has 
predominately negative helicity, and its accompanying ‘$\bar{\nu}$’ has predominately 
positive helicity. The fraction of the other helicity present is of the order m/E, 
where E ∼ few MeV, and the neutrino mass is less than 1 eV; this is, therefore, 
an almost undetectable ‘contamination’ of negative helicity component in the 
‘$\bar{\nu}$’. Now the property of the V-A interaction is that it conserves helicity in 
the zero mass limit (in which chirality is the same as helicity). Hence the 
positive helicity ‘$\bar{\nu}$’ from β⁻-decay will (predominately) produce a positive 
helicity lepton, which must be the e⁺ not the e⁻. Thus the property of the 
V-A interaction, together with the very small value of the neutrino mass, 
conspire effectively to forbid (20.34), independently of any considerations about 
‘lepton number’. 

Indeed, the ‘helicity-allowed’ reaction 


$$\text{‘}\bar{\nu}\text{’} + p \to e^+ + n \qquad (20.35)$$

was observed by Reines and Cowan (1956) (see also Cowan et al. 1956). 
Reaction (20.35) too, of course, can be interpreted in terms of ‘$\bar{\nu}$’ carrying a lepton 
number of −1, equal to that of the e⁺. It was also established that only ‘ν’ 
produced e⁻ via (20.34), where ‘ν’ is the helicity −1 state (or, on the other 
interpretation, the carrier of lepton number +1). 

The situation may therefore be summarized as follows. In the case of e⁻ 
and e⁺, all four ‘modes’, e⁻(λ = +1), e⁻(λ = −1), e⁺(λ = +1) and e⁺(λ = −1), 
are experimentally accessible via electromagnetic interactions, even though 
only two generally dominate in weak interactions (e⁻(λ = −1) and e⁺(λ = 
+1)). Neutrinos, on the other hand, seem to interact only weakly. In their 
case, we may if we wish say that the participating states are (in association 
with e⁻ or e⁺) $\bar{\nu}_e(\lambda = +1)$ and $\nu_e(\lambda = -1)$, to a very good approximation. 
But we may also regard these two states as simply two different helicity states 
of one particle, rather than of a particle and its antiparticle. As we have seen, 
the helicity rules do the job required just as well as the lepton number rules. In 
short, the question is: are these ‘neutrinos’ distinguished only by their helicity, 
or is there an additional distinguishing characteristic (‘electron number’)? 
In the latter case we should expect the ‘other’ two states $\bar{\nu}_e(\lambda = -1)$ and 
$\nu_e(\lambda = +1)$ to exist as well as the ones known from weak interactions. 

If, in fact, no quantum number — other than the helicity — exists which 
distinguishes the neutrino states, then we would have to say that the C- 
conjugate of a neutrino state is a neutrino, not an antineutrino — that is, 
‘neutrinos are their own antiparticles’. A neutrino would be a fermionic state 
somewhat like a photon, which is of course also its own antiparticle. Such 
‘C-self-conjugate’ fermions are called Majorana fermions (Majorana 1937), in 
contrast to the Dirac variety, which have all four possible modes present (2 
helicities, 2 particle/antiparticle). We discussed Majorana fermions in sections 
4.2.2 and 7.5.2. 

The distinction between the ‘Dirac’ and ‘Majorana’ neutrino possibilities 
becomes an essentially ‘metaphysical’ one in the limit of strictly massless neu- 
trinos, since then (as we have seen) a given helicity state cannot be flipped 
by going to a suitably moving Lorentz frame, nor by any weak (or electro- 
magnetic) interaction, since they both conserve chirality which is the same as 
helicity in the massless limit. We would have just the two states $\nu_e(\lambda = -1)$ 
and $\bar{\nu}_e(\lambda = +1)$, and no way of creating $\nu_e(\lambda = +1)$ or $\bar{\nu}_e(\lambda = -1)$. The 
‘¯’ label then becomes superfluous. Unfortunately, the massless limit is 
approached smoothly, and neutrino masses are, in fact, so small that the ‘wrong 
helicity’ suppression factors will make it very difficult to see the presence of the 
possible states $\nu_e(\lambda = +1)$, $\bar{\nu}_e(\lambda = -1)$. 

One much-discussed experimental test case (see, for example, the review 
by Vogel and Piepke in Nakamura et al. 2010) concerns ‘neutrinoless double 
β-decay’, which is the process A → A′ + e⁻ + e⁻, where A, A′ are nuclei. If 
the neutrino emitted in the first β-decay carries no electron-type conserved 
quantum number, then in principle it can initiate a second weak interaction, 
exactly as in Davis’ original argument, via the diagram shown in figure 20.2. 
Note that this is a second-order weak process, so that the amplitude contains 
the very small factor $G_{\rm F}^2$. Furthermore, the ν emitted along with the e⁻ 
at the first vertex will be predominately λ = +1, but in the second vertex 
the V-A interaction will ‘want’ it to have λ = −1, like the outgoing e⁻. 
Thus there is bound to be one ‘m/E’ suppression factor, whichever vertex we 
choose to make ‘easy’. (In the case of 3-state neutrino mixing, discussed in section 
21.4, the quantity ‘m’ will be an appropriately averaged mass.) There is 
also a complicated nuclear physics overlap factor. The expected half-lives of 
neutrinoless double β decays depend on the decaying nucleus, but are typically 
longer than 10²⁴–10²⁵ years. Evidently, the observation of this rare process 
is a formidable experimental challenge; as yet, no confirmed observation exists 
(see also section 21.4.5). 

FIGURE 20.2 
Double β-decay without emission of a neutrino, a test for Majorana-type 
neutrinos. 

In the same way, ‘$\bar{\nu}$’ particles accompanying the μ⁻’s in π⁻ decay 

$$\pi^- \to \mu^- + \text{‘}\bar{\nu}\text{’} \qquad (20.36)$$

are observed to produce only μ⁺’s when they interact with matter, not μ⁻’s. 
Again this can be interpreted either in terms of helicity conservation or in 
terms of conservation of a leptonic quantum number $L_\mu$. We shall assume the 
analogous properties are true for the ‘ν’s accompanying τ leptons. 

On the other hand, helicity arguments alone would allow the reaction 

$$\text{‘}\bar{\nu}_\mu\text{’} + p \to e^+ + n \qquad (20.37)$$


to proceed, but as we saw in section 1.2.1 the experiment of Danby et al. 
(1962) found no evidence for it. Thus there is evidence, in this type of 
reaction, for a flavour quantum number distinguishing neutrinos which interact 
in association with one kind of charged lepton from those which interact in 
association with a different charged lepton. The electroweak sector of the 
Standard Model was originally formulated on the assumption that the three 
lepton flavours $L_e$, $L_\mu$ and $L_\tau$ are conserved, and that the neutrinos are 
massless. It turns out that these two assumptions are related, in the sense that 
if neutrinos have mass, then (barring degeneracies) ‘neutrino oscillations’ can 
occur, in which a state of one lepton flavour can acquire a component of 
another, as it propagates. Compelling evidence accumulated during the 2000s 
for oscillations of neutrinos caused by non-zero masses and neutrino mixing. 
Strictly speaking, neutrino masses and oscillations lie outside the framework 
of the original Standard Model, and they are sometimes so regarded. Apart 
from anything else, the phenomenology of massive neutrinos has to allow for 
the possibility that they are Majorana, rather than Dirac, fermions. For the 
moment, we shall continue with a semi-historical path, and proceed with weak 
interaction phenomenology on the basis of the original Standard Model, with 
massless neutrinos. We return to the question of neutrino mass when we dis- 
cuss neutrino oscillations (along with analogous oscillations in meson systems) 
in chapter 21. 




20.4 The universal current × current theory for weak 
interactions of leptons 


After the breakthroughs of parity violation and V-A theory, the earlier hopes 
(Pontecorvo 1947, Klein 1948, Puppi 1948, Lee, Rosenbluth and Yang 1949, 
Tiomno and Wheeler 1949) were revived of a universal weak interaction among 
the pairs of particles (p,n), (ν_e,e⁻), (ν_μ,μ⁻), using the V-A modification to 
Fermi’s theory. From our modern standpoint, this list has to be changed 
by the replacement of (p,n) by the corresponding quarks (u,d), and by the 
inclusion of the third lepton pair (ν_τ,τ⁻) as well as two other quark pairs 
(c,s) and (t,b). It is to these pairs that the ‘V-A’ structure applies, as already 
indicated in section 20.2.2, and a certain form of ‘universality’ does hold, as 
we now describe. 

Because of certain complications which arise, we shall postpone the 
discussion of the quark currents until section 20.7, concentrating here on the 
leptonic currents². In this case, Fermi’s original vector-like current 
$\bar{\hat{\psi}}_e\gamma^\mu\hat{\psi}_\nu$ 
becomes modified to a total leptonic charged current 

$$\hat{j}^\mu_{\rm CC}({\rm leptons}) = \hat{j}^\mu_{\rm wk}(e) + \hat{j}^\mu_{\rm wk}(\mu) + \hat{j}^\mu_{\rm wk}(\tau) \qquad (20.38)$$

where, for example, 

$$\hat{j}^\mu_{\rm wk}(e) = \bar{\hat{\nu}}_e\gamma^\mu(1-\gamma_5)\hat{e}. \qquad (20.39)$$

In (20.39) we are now adopting, for the first time, a useful shorthand whereby 
the field operator for the electron field, say, is denoted by $\hat{e}(x)$ rather than 
$\hat{\psi}_e(x)$, and the ‘x’ argument is suppressed. The ‘charged’ current terminology 
refers to the fact that these weak current operators $\hat{j}^\mu_{\rm wk}$ carry net charge, in 
contrast to an electromagnetic current operator such as $\bar{\hat{e}}\gamma^\mu\hat{e}$ which is 
electrically neutral. We shall see in section 20.6 that there are also electrically 
neutral weak currents. 


2 Very much the same complications arise for the leptonic currents too, in the case of 
massive neutrinos, as we shall see in section 21.4. 

The interaction Hamiltonian density accounting for all leptonic weak 
interactions is then taken to be 

$$\hat{\mathcal{H}}^{\rm lep}_{\rm CC} = \frac{G_{\rm F}}{\sqrt{2}}\,\hat{j}^\mu_{\rm CC}({\rm leptons})\,\hat{j}^\dagger_{{\rm CC}\mu}({\rm leptons}). \qquad (20.40)$$

Note that 

$$(\bar{\hat{\nu}}_e\gamma^\mu(1-\gamma_5)\hat{e})^\dagger = \bar{\hat{e}}\gamma^\mu(1-\gamma_5)\hat{\nu}_e \qquad (20.41)$$

and similarly for the other bilinears. The currents can also be written in terms 
of the chiral components of the fields (recall section 20.2.2) using 

$$2\bar{\hat{\nu}}_{e{\rm L}}\gamma^\mu\hat{e}_{\rm L} = \bar{\hat{\nu}}_e\gamma^\mu(1-\gamma_5)\hat{e}, \qquad (20.42)$$

for example. ‘Universality’ is manifest in the fact that all the lepton pairs 
have the same form of the V-A coupling, and the same ‘strength parameter’ 
$G_{\rm F}/\sqrt{2}$ multiplies all of the products in (20.40). 

The terms in (20.40), when it is multiplied out, describe many physical 
processes. For example, the term 

$$\frac{G_{\rm F}}{\sqrt{2}}\,\bar{\hat{\nu}}_\mu\gamma^\mu(1-\gamma_5)\hat{\mu}\;\bar{\hat{e}}\gamma_\mu(1-\gamma_5)\hat{\nu}_e \qquad (20.43)$$

describes μ⁻ decay: 

$$\mu^- \to \nu_\mu + e^- + \bar{\nu}_e, \qquad (20.44)$$

as well as all the reactions related by ‘crossing’ particles from one side to the 
other, for example 

$$\nu_\mu + e^- \to \mu^- + \nu_e. \qquad (20.45)$$

The value of $G_{\rm F}$ can be determined from the rate for process (20.44) (see for 
example Renton 1990, section 6.1.2), and it is found to be 

$$G_{\rm F} \simeq 1.166 \times 10^{-5}\,{\rm GeV}^{-2}. \qquad (20.46)$$

This is a convenient moment to notice that the theory is not renormalizable 
according to the criteria discussed in section 11.8 at the end of the previous 
volume: $G_{\rm F}$ has dimensions (mass)⁻². We shall return to this aspect of 
Fermi-type V-A theory in section 22.1. 
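
To illustrate how the rate for (20.44) fixes $G_{\rm F}$, the sketch below uses the standard lowest-order expression Γ = G_F² m_μ⁵/(192π³), valid when the electron mass is neglected. That formula is not derived in this section and is quoted here as an assumption of the example, as are the rounded numerical inputs for the muon mass and ħ.

```python
import math

G_F = 1.166e-5    # GeV^-2, the value quoted in (20.46)
M_MU = 0.10566    # GeV, muon mass (rounded input, not quoted in the text)
HBAR = 6.582e-25  # GeV s, converts a width in GeV to a lifetime in seconds

def muon_decay_rate(g_f, m_mu):
    """Lowest-order muon decay rate in GeV, electron mass neglected."""
    return g_f**2 * m_mu**5 / (192.0 * math.pi**3)

def fermi_constant_from_lifetime(tau, m_mu):
    """Invert the same formula: G_F in GeV^-2 from a lifetime in seconds."""
    rate = HBAR / tau
    return math.sqrt(192.0 * math.pi**3 * rate / m_mu**5)

rate = muon_decay_rate(G_F, M_MU)
lifetime = HBAR / rate
print(f"rate = {rate:.3e} GeV, lifetime = {lifetime:.3e} s")
print(f"G_F recovered = {fermi_constant_from_lifetime(lifetime, M_MU):.4e} GeV^-2")
```

The lifetime that comes out is close to two microseconds, the familiar scale of the muon lifetime. Note that the rate is $G_{\rm F}^2$ times the fifth power of a mass, as required by the dimensions (mass)⁻² of $G_{\rm F}$.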

There are also what we might call ‘diagonal’ terms in which the same 
lepton pair is taken from $\hat{j}^\mu_{\rm wk}$ and $\hat{j}^\dagger_{{\rm wk}\mu}$, for example 

$$\frac{G_{\rm F}}{\sqrt{2}}\,\bar{\hat{\nu}}_e\gamma^\mu(1-\gamma_5)\hat{e}\;\bar{\hat{e}}\gamma_\mu(1-\gamma_5)\hat{\nu}_e \qquad (20.47)$$

which describes reactions such as 

$$\bar{\nu}_e + e^- \to \bar{\nu}_e + e^-. \qquad (20.48)$$

The cross section for (20.48) was measured by Reines, Gurr and Sobel (1976) 
after many years of effort; the value obtained was consistent with the 
Glashow-Salam-Weinberg theory (see section 22.3), with the parameter 
$\sin^2\theta_{\rm W} = 0.29 \pm 0.05$. 

It is interesting that some seemingly rather similar processes are forbidden 
to occur, to first order in $\hat{\mathcal{H}}^{\rm lep}_{\rm CC}$, for example 

$$\bar{\nu}_\mu + e^- \to \bar{\nu}_\mu + e^-. \qquad (20.49)$$

For reasons which will become clearer in section 20.6, (20.49) is called a 
‘neutral current’ process, in contrast to all the others (such as β-decay or μ-decay) 
we have discussed so far, which are called ‘charged current’ processes. If the 
lepton pairs are arranged so as to have no net lepton number (for example 
$e^-\bar{\nu}_e$, $\mu^-\bar{\nu}_\mu$, $\nu_\mu\bar{\nu}_\mu$ etc.) then pairs with non-zero charge occur in charged 
current processes, while those with zero charge participate in neutral current 
processes. In the case of (20.48), the leptons can be grouped either as ($\bar{\nu}_e e^-$) 
which is charged, or as ($\bar{\nu}_e\nu_e$) or (e⁺e⁻) which are neutral. On the other 
hand, there is no way of pairing the leptons in (20.49) so as to cancel the 
lepton number and have non-zero charge. So (20.49) is a purely ‘neutral current’ 
process, while some ‘neutral current’ contribution could be present in (20.48), 
in principle. In 1973 such neutral current processes were discovered (Hasert 
et al. 1973), generating a whole new wave of experimental activity. Their 
existence had, in fact, been predicted in the first version of the Standard Model, 
due to Glashow (1961). Today we know that charged current processes are 
mediated by the W± bosons, and the neutral current ones by the Z⁰. We shall 
discuss the neutral current couplings in section 20.6. 


20.5 Calculation of the cross section for $\nu_\mu + e^- \to \mu^- + \nu_e$ 


After so much qualitative discussion it is time to calculate something. We 
choose the process (20.45), sometimes called inverse muon decay, which is a 
pure ‘charged current’ process. The amplitude, in the Fermi-like V-A current 
theory, is 


$$\mathcal{M} = -{\rm i}(G_{\rm F}/\sqrt{2})\,\bar{u}(\mu,k')\gamma_\mu(1-\gamma_5)u(\nu_\mu,k)\;\bar{u}(\nu_e,p')\gamma^\mu(1-\gamma_5)u(e,p). \qquad (20.50)$$

We shall be interested in energies much greater than any of the lepton masses, 
and so we shall work in the massless limit; this is mainly for ease of calculation, 
and the full expressions for non-zero masses can be obtained with more effort. 
From the general formula (6.129) for 2 → 2 scattering in the CM system, 
we have, neglecting all masses, 

$$\frac{{\rm d}\sigma}{{\rm d}\Omega} = \frac{1}{64\pi^2 s}\,\overline{|\mathcal{M}|^2} \qquad (20.51)$$


where $\overline{|\mathcal{M}|^2}$ is the appropriate spin-averaged matrix element squared, as in 
(8.183) for example. In the case of neutrino-electron scattering, we must 
average over initial electron states for unpolarized electrons and sum over the final 
muon polarization states. For the neutrinos there is no averaging over initial 
neutrino helicities, since only left-handed (massless) neutrinos participate in 
the weak interaction. Similarly, there is no sum over final neutrino helicities. 
However, for convenience of calculation, we can in fact sum over both helicity 
states of both neutrinos since the (1 − γ5) factors guarantee that right-handed 
neutrinos contribute nothing to the cross section. As for the eμ scattering 
example in section 8.7, the calculation then reduces to a product of traces: 

$$\overline{|\mathcal{M}|^2} = \left(\frac{G_{\rm F}^2}{2}\right){\rm Tr}[\not{k}'\gamma_\mu(1-\gamma_5)\not{k}\gamma_\nu(1-\gamma_5)]\;\tfrac{1}{2}{\rm Tr}[\not{p}'\gamma^\mu(1-\gamma_5)\not{p}\gamma^\nu(1-\gamma_5)], \qquad (20.52)$$

all lepton masses being neglected. We define 

$$\overline{|\mathcal{M}|^2} = \left(\frac{G_{\rm F}^2}{2}\right)N_{\mu\nu}E^{\mu\nu} \qquad (20.53)$$

where the ν_μ → μ⁻ tensor $N_{\mu\nu}$ is given by 

$$N_{\mu\nu} = {\rm Tr}[\not{k}'\gamma_\mu(1-\gamma_5)\not{k}\gamma_\nu(1-\gamma_5)] \qquad (20.54)$$

without a 1/(2s + 1) factor, and the e⁻ → ν_e tensor is 

$$E^{\mu\nu} = \tfrac{1}{2}{\rm Tr}[\not{p}'\gamma^\mu(1-\gamma_5)\not{p}\gamma^\nu(1-\gamma_5)] \qquad (20.55)$$

including a factor of ½ for spin averaging. 

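Since this calculation involves a couple of new features, let us look at it in 
some detail. By commuting the (1 − γ5) factor through two γ matrices ($\not{k}\gamma_\nu$) 
and using the result that 

$$(1-\gamma_5)^2 = 2(1-\gamma_5) \qquad (20.56)$$

the tensor $N_{\mu\nu}$ may be written as 

$$\begin{aligned} N_{\mu\nu} &= 2{\rm Tr}[\not{k}'\gamma_\mu(1-\gamma_5)\not{k}\gamma_\nu] \\ &= 2{\rm Tr}(\not{k}'\gamma_\mu\not{k}\gamma_\nu) - 2{\rm Tr}(\gamma_5\not{k}\gamma_\nu\not{k}'\gamma_\mu). \qquad (20.57)\end{aligned}$$

The first trace is the same as in our calculation of eμ scattering (cf (8.186)): 

$${\rm Tr}(\not{k}'\gamma_\mu\not{k}\gamma_\nu) = 4[k'_\mu k_\nu + k'_\nu k_\mu + (q^2/2)g_{\mu\nu}]. \qquad (20.58)$$

The second trace must be evaluated using the result 

$${\rm Tr}(\gamma_5\not{a}\not{b}\not{c}\not{d}) = 4{\rm i}\epsilon_{\alpha\beta\gamma\delta}a^\alpha b^\beta c^\gamma d^\delta \qquad (20.59)$$

(see equation (J.37) in appendix J of volume 1). The totally antisymmetric 
tensor $\epsilon_{\alpha\beta\gamma\delta}$ is just the generalization of $\epsilon_{ijk}$ to four dimensions, and is defined 
by 

$$\epsilon_{\alpha\beta\gamma\delta} = \begin{cases} +1 & \text{for } \epsilon_{0123} \text{ and all even permutations of } 0,1,2,3 \\ -1 & \text{for } \epsilon_{1023} \text{ and all odd permutations of } 0,1,2,3 \\ \phantom{+}0 & \text{otherwise.} \end{cases} \qquad (20.60)$$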

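Its appearance here is a direct consequence of parity violation. Notice that 
this definition has the consequence that 

$$\epsilon_{0123} = +1 \qquad (20.61)$$

but 

$$\epsilon^{0123} = -1. \qquad (20.62)$$

We will also need to contract two ε tensors. By looking at the possible 
combinations, it should be easy to convince yourself of the result 

$$\epsilon_{ijk}\epsilon_{ilm} = \begin{vmatrix} \delta_{jl} & \delta_{jm} \\ \delta_{kl} & \delta_{km} \end{vmatrix} \qquad (20.63)$$

i.e. 

$$\epsilon_{ijk}\epsilon_{ilm} = \delta_{jl}\delta_{km} - \delta_{kl}\delta_{jm}. \qquad (20.64)$$

For the four-dimensional ε tensor one can show (see problem 20.6) 

$$\epsilon_{\mu\nu\alpha\beta}\epsilon^{\mu\nu\gamma\delta} = -2!\begin{vmatrix} \delta^\gamma_\alpha & \delta^\delta_\alpha \\ \delta^\gamma_\beta & \delta^\delta_\beta \end{vmatrix} \qquad (20.65)$$

where the minus sign arises from (20.62) and the 2! from the fact that the two 
indices are contracted. 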
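
The identity (20.65) involves only a finite number of index combinations, so it can be verified by brute force. The sketch below is an illustration under the stated conventions (metric signature +−−−, ε with lower indices normalized by ε₀₁₂₃ = +1); the helper names are assumptions of this example.

```python
from itertools import permutations, product

METRIC = (1, -1, -1, -1)  # diagonal of g^{mu nu}

def permutation_sign(perm):
    """+1 for even and -1 for odd permutations of (0, 1, 2, 3)."""
    perm, sign = list(perm), 1
    for i in range(len(perm)):
        while perm[i] != i:
            j = perm[i]
            perm[i], perm[j] = perm[j], perm[i]  # one transposition
            sign = -sign
    return sign

EPS_LOWER = {perm: permutation_sign(perm) for perm in permutations(range(4))}

def eps_lower(a, b, c, d):
    """epsilon with four lower indices, eq. (20.60); zero if any index repeats."""
    return EPS_LOWER.get((a, b, c, d), 0)

def eps_upper(a, b, c, d):
    """Raise all four indices with the diagonal metric."""
    return METRIC[a] * METRIC[b] * METRIC[c] * METRIC[d] * eps_lower(a, b, c, d)

def delta(a, b):
    return 1 if a == b else 0

def lhs(alpha, beta, gamma, dlt):
    """epsilon_{mu nu alpha beta} epsilon^{mu nu gamma delta}, summed over mu, nu."""
    return sum(eps_lower(m, n, alpha, beta) * eps_upper(m, n, gamma, dlt)
               for m in range(4) for n in range(4))

def rhs(alpha, beta, gamma, dlt):
    """-2! times the 2x2 determinant of Kronecker deltas in (20.65)."""
    return -2 * (delta(alpha, gamma) * delta(beta, dlt)
                 - delta(alpha, dlt) * delta(beta, gamma))

identity_holds = all(lhs(*idx) == rhs(*idx) for idx in product(range(4), repeat=4))
print("eps_0123 =", eps_lower(0, 1, 2, 3), " eps^0123 =", eps_upper(0, 1, 2, 3))
print("identity (20.65) holds for all index values:", identity_holds)
```

The first printed line reproduces (20.61) and (20.62); the sign flip on raising the indices is the origin of the overall minus sign in (20.65).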

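We can now evaluate $N_{\mu\nu}$. We obtain, after some rearrangement of indices, 
the result for the ν_μ → μ⁻ tensor: 

$$N_{\mu\nu} = 8[(k'_\mu k_\nu + k'_\nu k_\mu + (q^2/2)g_{\mu\nu}) - {\rm i}\epsilon_{\mu\nu\alpha\beta}k^\alpha k'^\beta]. \qquad (20.66)$$

For the electron tensor $E^{\mu\nu}$ we have a similar result (divided by 2): 

$$E^{\mu\nu} = 4[(p'^\mu p^\nu + p'^\nu p^\mu + (q^2/2)g^{\mu\nu}) - {\rm i}\epsilon^{\mu\nu\gamma\delta}p_\gamma p'_\delta]. \qquad (20.67)$$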


Next, we have to perform the contraction $N_{\mu\nu}E^{\mu\nu}$ in (20.53). In the case 
of elastic e⁻μ⁻ scattering considered in section 8.7, the analogous contraction 
between the tensors $L_{\mu\nu}$ and $M^{\mu\nu}$ was simplified by using the conditions 
$q^\mu L_{\mu\nu} = q^\nu L_{\mu\nu} = 0$ (see (8.189)), which followed from electromagnetic current 
conservation at the electron vertex (see (8.188)): $q^\mu\bar{u}(k')\gamma_\mu u(k) = 0$. Here, the 
analogous vertex is $\bar{u}(\mu,k')\gamma_\mu(1-\gamma_5)u(\nu_\mu,k)$. In this case, when we contract 
this with $q^\mu = (k-k')^\mu$ we find a non-zero result: 

$$(m_{\nu_\mu} - m_\mu)\,\bar{u}(\mu,k')u(\nu_\mu,k) + (m_{\nu_\mu} + m_\mu)\,\bar{u}(\mu,k')\gamma_5u(\nu_\mu,k), \qquad (20.68)$$

using the on-shell conditions for the spinors. (In the electromagnetic case, 
there was no γ5 term, and the initial and final masses were the same.) The 
quantity (20.68) vanishes only when the lepton masses vanish, and that is the 
approximation we shall make: i.e. we shall neglect all lepton masses. Then 

$$q^\mu N_{\mu\nu} = q^\nu N_{\mu\nu} = 0, \qquad (20.69)$$

and we may write 

$$p' = p + q \qquad (20.70)$$

and drop all terms involving q in the contraction with $N_{\mu\nu}$. In the 
antisymmetric term, however, we have 

$$\epsilon^{\mu\nu\gamma\delta}p_\gamma p'_\delta = \epsilon^{\mu\nu\gamma\delta}p_\gamma(p_\delta + q_\delta) = \epsilon^{\mu\nu\gamma\delta}p_\gamma q_\delta, \qquad (20.71)$$

since the term with $p_\delta$ vanishes because of the antisymmetry of $\epsilon^{\mu\nu\gamma\delta}$. Thus 
we arrive at 

$$E^{\mu\nu}_{\rm eff} = 8p^\mu p^\nu + 2q^2g^{\mu\nu} - 4{\rm i}\epsilon^{\mu\nu\gamma\delta}p_\gamma q_\delta. \qquad (20.72)$$


We must now evaluate the ‘N·E’ contraction in (20.53). Since we are 
neglecting all masses, it is easiest to perform the calculation in invariant form 
before specializing to the ‘laboratory’ frame. The usual Mandelstam variables 
are (neglecting all masses) 

$$\begin{aligned} s &= 2k\cdot p &\qquad (20.73) \\ u &= -2k'\cdot p &\qquad (20.74) \\ t &= -2k\cdot k' = q^2 &\qquad (20.75)\end{aligned}$$

satisfying 

$$s + t + u = 0. \qquad (20.76)$$


The result of performing the contraction 

$$N_{\mu\nu}E^{\mu\nu} = N_{\mu\nu}E^{\mu\nu}_{\rm eff} \qquad (20.77)$$

may be found using the result (20.65) for the contraction of two ε tensors (see 
problem 20.6): the answer for ν_μe⁻ → μ⁻ν_e is 

$$N_{\mu\nu}E^{\mu\nu} = 16(s^2+u^2) + 16(s^2-u^2) \qquad (20.78)$$

where the first term arises from the symmetric part of $N_{\mu\nu}$, similar to $L_{\mu\nu}$, 
and the second term from the antisymmetric part involving $\epsilon_{\mu\nu\alpha\beta}$. We have 
also used 

$$t = q^2 = -(s+u) \qquad (20.79)$$

valid in the approximation in which we are working. Thus for ν_μe⁻ → μ⁻ν_e 
we have 

$$N_{\mu\nu}E^{\mu\nu} = +32s^2. \qquad (20.80)$$
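
The result 32s² can be cross-checked by contracting the tensors (20.66) and (20.67) directly for explicit massless momenta. The sketch below does this in the CM frame; the choice of frame, the sample momentum and angles, and all function names are assumptions made for this illustration.

```python
import math
from itertools import permutations

METRIC = (1.0, -1.0, -1.0, -1.0)  # diagonal of g_{mu nu} (same entries as g^{mu nu})

def permutation_sign(perm):
    """+1 for even and -1 for odd permutations of (0, 1, 2, 3)."""
    perm, sign = list(perm), 1
    for i in range(len(perm)):
        while perm[i] != i:
            j = perm[i]
            perm[i], perm[j] = perm[j], perm[i]
            sign = -sign
    return sign

EPS_LOWER = {perm: permutation_sign(perm) for perm in permutations(range(4))}

def eps_lower(a, b, c, d):
    return EPS_LOWER.get((a, b, c, d), 0)

def eps_upper(a, b, c, d):
    return METRIC[a] * METRIC[b] * METRIC[c] * METRIC[d] * eps_lower(a, b, c, d)

def lower(vec):
    """Lower the index of a contravariant four-vector."""
    return [METRIC[i] * vec[i] for i in range(4)]

def dot(a, b):
    return sum(METRIC[i] * a[i] * b[i] for i in range(4))

def cm_momenta(p_cm, theta):
    """Massless momenta k (nu_mu), p (e), k_out (mu), p_out (nu_e) in the CM frame."""
    k = [p_cm, 0.0, 0.0, p_cm]
    p = [p_cm, 0.0, 0.0, -p_cm]
    k_out = [p_cm, p_cm * math.sin(theta), 0.0, p_cm * math.cos(theta)]
    p_out = [p_cm, -p_cm * math.sin(theta), 0.0, -p_cm * math.cos(theta)]
    return k, p, k_out, p_out

def contract_tensors(k, p, k_out, p_out):
    """N_{mu nu} E^{mu nu} built from (20.66) and (20.67)."""
    q = [a - b for a, b in zip(k, k_out)]
    q2 = dot(q, q)
    k_lo, k_out_lo, p_lo, p_out_lo = lower(k), lower(k_out), lower(p), lower(p_out)
    total = 0j
    for mu in range(4):
        for nu in range(4):
            g = METRIC[mu] if mu == nu else 0.0
            n_sym = k_out_lo[mu] * k_lo[nu] + k_out_lo[nu] * k_lo[mu] + 0.5 * q2 * g
            n_eps = sum(eps_lower(mu, nu, a, b) * k[a] * k_out[b]
                        for a in range(4) for b in range(4))
            e_sym = p_out[mu] * p[nu] + p_out[nu] * p[mu] + 0.5 * q2 * g
            e_eps = sum(eps_upper(mu, nu, c, d) * p_lo[c] * p_out_lo[d]
                        for c in range(4) for d in range(4))
            total += 8 * (n_sym - 1j * n_eps) * 4 * (e_sym - 1j * e_eps)
    return total

P_CM = 1.5  # arbitrary units
results = []
for theta in (0.3, 1.2, 2.5):
    k, p, k_out, p_out = cm_momenta(P_CM, theta)
    s = dot([a + b for a, b in zip(k, p)], [a + b for a, b in zip(k, p)])
    results.append((contract_tensors(k, p, k_out, p_out), 32 * s**2))
    print(theta, results[-1])
```

The contraction is real, independent of the scattering angle, and equal to 32s² to rounding accuracy, in agreement with (20.80). The sketch uses the full tensor (20.67) rather than the reduced form (20.72); the two give the same contraction because of (20.69).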
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and with . 
do 1 Gh 
— = —— | — ] N,, EY 20.81 
dQ aan (E)r ee 
we finally obtain the result 
do Gis 
40 ae (20.82) 
The total cross section is then 
G2 
= ZF (20.83) 
T 
Since $t = -2p^2(1 - \cos\theta)$, where $p$ is the CM momentum and $\theta$ the CM
scattering angle, (20.82) can alternatively be written in invariant form as
(problem 20.7)
$$\frac{d\sigma}{dt} = \frac{G_F^2}{\pi}. \qquad (20.84)$$
All other purely leptonic processes may be calculated in an analogous fashion
(see Bailin 1982 and Renton 1990 for further examples).

When we discuss deep inelastic neutrino scattering in section 20.7.2, we
shall be interested in neutrino ‘laboratory’ cross sections, as in the electron
scattering case of chapter 9. A simple calculation gives $s \simeq 2m_e E$ (neglecting
squares of lepton masses by comparison with $m_e E$), where $E$ is the ‘laboratory’
energy of a neutrino incident, in this example, on a stationary electron. It
follows that the total ‘laboratory’ cross section in this Fermi-like current-current
model rises linearly with $E$. We shall return to the implications of this
in section 20.7.2.

The process (20.45) was measured by Bergsma et al. (1983) using the
CERN wide band beam ($E_\nu \sim 20$ GeV). The ratio of the observed number
of events to that expected for pure V-A was quoted as $0.98 \pm 0.12$.
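To give a feeling for the size of the ‘laboratory’ cross section, the following minimal sketch evaluates (20.83) with $s \simeq 2m_e E$. The numerical inputs (an approximate value of $G_F$, the electron mass, and the conversion $1\ \mathrm{GeV}^{-2} \approx 0.3894 \times 10^{-27}\ \mathrm{cm}^2$) are standard values that we supply ourselves, and the function name is hypothetical.

```python
import math

G_F = 1.166e-5            # Fermi constant in GeV^-2 (approximate, assumed input)
M_E = 0.511e-3            # electron mass in GeV (assumed input)
GEV2_TO_CM2 = 0.3894e-27  # 1 GeV^-2 in cm^2, from (hbar c)^2

def sigma_total_cm2(e_lab_gev):
    """Total nu_mu e- -> mu- nu_e cross section (20.83), lepton masses neglected."""
    s = 2.0 * M_E * e_lab_gev          # s ~ 2 m_e E in the laboratory frame
    return G_F**2 * s / math.pi * GEV2_TO_CM2

for energy in (10.0, 20.0, 40.0):
    print(energy, sigma_total_cm2(energy))
```

Doubling $E$ doubles the result, which is the linear rise referred to above; at $E = 20$ GeV the sketch gives a cross section of a few times $10^{-40}\ \mathrm{cm}^2$.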


20.6 Leptonic weak neutral currents 


The first observations of the weak neutral current process $\bar\nu_\mu e^- \to \bar\nu_\mu e^-$ were
reported by Hasert et al. (1973), in a pioneer experiment using the heavy-liquid
bubble chamber Gargamelle at CERN, irradiated with a $\bar\nu_\mu$ beam. As
in the case of the charged currents, much detailed experimental work was 
necessary to determine the precise form of the neutral current couplings. They 
are, of course, predicted by the Glashow-Salam-Weinberg theory, as we shall 
explain in chapter 22. For the moment, we continue with the current-current 
approach, parametrizing the currents in a convenient way. 

There are two types of ‘neutral current’ couplings, those involving neutrinos
of the form $\bar\nu_l \ldots \nu_l$, and those involving the charged leptons of the form
$\bar l \ldots l$. We shall assume the following form for these currents (with one eye on
the GSW theory to come):

(1) neutrino neutral current
$$g_N c^{\nu_l}\,\bar\nu_l \gamma^\mu \left(\frac{1-\gamma_5}{2}\right)\nu_l, \qquad l = e, \mu, \tau; \qquad (20.85)$$
(2) charged lepton neutral current
$$g_N\,\bar l \gamma^\mu \left[c^l_L\,\frac{(1-\gamma_5)}{2} + c^l_R\,\frac{(1+\gamma_5)}{2}\right] l. \qquad (20.86)$$


This is, of course, by no means the most general possible parametrization. 
The neutrino coupling is retained as pure ‘V-A’, while the coupling in the 
charged lepton sector is now a combination of ‘V-A’ and ‘V+A’ with certain 
coefficients $c^l_L$ and $c^l_R$. We may also write the coupling in terms of ‘V’ and
‘A’ coefficients defined by $c^l_V = c^l_L + c^l_R$, $c^l_A = c^l_L - c^l_R$. An overall factor $g_N$
determines the strength of the neutral currents as compared to the charged 
ones; the c’s determine the relative amplitudes of the various neutral current 
processes. 

As we shall see, an essential feature of the GSW theory is its prediction 
of weak neutral current processes, with couplings determined in terms of one 
parameter of the theory called ‘$\theta_W$’, the ‘weak mixing angle’ (Glashow 1961,
Weinberg 1967). The GSW predictions for the parameter $g_N$ and the $c$’s are
(see equations (22.59)-(22.62))
$$g_N = g/\cos\theta_W, \qquad c^{\nu_l} = \tfrac12, \qquad c^l_L = -\tfrac12 + a, \qquad c^l_R = a \qquad (20.87)$$
for $l = e, \mu, \tau$, where $a = \sin^2\theta_W$ and $g$ is the SU(2) gauge coupling. Note
that a strong form of ‘universality’ is involved here too: the coefficients are
independent of the ‘flavour’ $e$, $\mu$ or $\tau$, for both neutrinos and charged leptons.
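To make the relative sizes of the couplings in (20.87) concrete, the following minimal sketch evaluates the coefficients, together with the ‘V’ and ‘A’ combinations defined above, for an illustrative input $\sin^2\theta_W = 0.23$. The function name and dictionary keys are our own labels rather than notation from the text.

```python
def lepton_nc_couplings(sin2_theta_w):
    """Neutral-current coefficients of (20.87) for a given a = sin^2(theta_W)."""
    a = sin2_theta_w
    c_left = -0.5 + a
    c_right = a
    return {
        "c_nu": 0.5,
        "c_L": c_left,
        "c_R": c_right,
        "c_V": c_left + c_right,   # vector combination
        "c_A": c_left - c_right,   # axial combination
    }

couplings = lepton_nc_couplings(0.23)   # illustrative input value
for name, value in couplings.items():
    print(name, round(value, 4))
```

With this input the vector coefficient of the charged leptons is small compared with the axial one, because $a$ is close to $1/4$.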

The following reactions are available for experimental measurement (in 
addition to the charged current process (20.45) already discussed): 


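$$\nu_\mu e^- \to \nu_\mu e^-, \qquad \bar\nu_\mu e^- \to \bar\nu_\mu e^- \qquad \text{(NC)} \qquad (20.88)$$
$$\nu_e e^- \to \nu_e e^-, \qquad \bar\nu_e e^- \to \bar\nu_e e^- \qquad \text{(NC+CC)} \qquad (20.89)$$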


where ‘NC’ means neutral current and ‘CC’ charged current. Formulae for 
these cross sections are given in section 22.3. The experiments are discussed 
and reviewed in Commins and Bucksbaum (1983), Renton (1990), and by 
Winter (2000). All observations are in excellent agreement with the GSW
predictions, with $\theta_W$ determined as $\sin^2\theta_W = 0.23$. The reader must note,
however, that modern precision measurements are sensitive to higher-order
(loop) corrections, which must be included in comparing the full GSW theory
with experiment (see section 22.6). The simultaneous fit of data from all four
reactions in terms of the single parameter $\theta_W$ already provides strong
confirmation of the theory; indeed, such confirmation was already emerging
in the late 1970s and early 1980s, before the actual discovery of the $W^\pm$
and $Z^0$ bosons. It is also interesting to note that the presence of vector (V)
interactions in the neutral current processes may suggest the possibility of
some kind of link with electromagnetic interactions, which are of course also
‘neutral’ (in this sense) and vector-like. In the GSW theory, this linkage is
provided essentially through the parameter $\theta_W$, as we shall see.




20.7 Quark weak currents 


We now turn our attention to the weak interactions of quarks. We shall begin 
by considering an earlier world, when only two generations (four flavours) 
were known. 


20.7.1 Two generations 


The original version of V-A theory was framed in terms of a nucleonic current 
of the form $\bar\psi_p \gamma^\mu (1 - r\gamma_5)\psi_n$. With the acceptance of quark substructure it
was natural to re-interpret such a hadronic transition by a charged current of
the form $\bar u \gamma^\mu (1 - \gamma_5) d$, very similar to the charged lepton currents; indeed, here
was a further example of ‘universality’, this time between quarks and leptons.
Detailed comparison with experiment showed, however, that such $d \to u$
transitions were very slightly weaker than the analogous leptonic ones; this
could be established by comparing the rates for $n \to p e^- \bar\nu_e$ and $\mu^- \to \nu_\mu e^- \bar\nu_e$.

But for quarks (or their hadronic composites) there is a further complica- 
tion, which is the very familiar phenomenon of flavour change in weak hadronic 
processes (recall the discussion in section 1.2.2). The first step towards the 
modern theory of quark currents was taken by Cabibbo (1963); in a sense, it 
restored universality. Cabibbo postulated that the strength of the hadronic 
weak interaction was shared between the $\Delta S = 0$ and $\Delta S = 1$ transitions
(where S is the strangeness quantum number), the latter being relatively 
suppressed as compared to the former. According to Cabibbo's hypothesis, 
phrased in terms of quarks, the total weak charged current for u, d and s 
quarks is 


$$j^\mu_{\rm Cab}(u, d, s) = \cos\theta_C\,\bar u \gamma^\mu \frac{(1-\gamma_5)}{2} d + \sin\theta_C\,\bar u \gamma^\mu \frac{(1-\gamma_5)}{2} s, \qquad (20.90)$$
where $\theta_C$ is the ‘Cabibbo angle’ (not to be confused with $\theta_W$). We can now
postulate a total weak charged current
$$j^\mu_{CC}(\text{total}) = j^\mu_{CC}(\text{leptons}) + j^\mu_{\rm Cab}(u, d, s), \qquad (20.91)$$
where $j^\mu_{CC}(\text{leptons})$ is given by (20.38), and then generalize (20.40) to
$$\mathcal{H}^{\rm tot}_{CC} = \frac{4G_F}{\sqrt2}\, j^\mu_{CC}(\text{total})\, j^\dagger_{CC\,\mu}(\text{total}). \qquad (20.92)$$

FIGURE 20.3
Strangeness-changing semi-leptonic weak decays.


The effective interaction (20.92) describes a great many processes. The 
purely leptonic ones discussed previously are, of course, present in the term 
$j^\mu_{CC}(\text{leptons})\, j^\dagger_{CC\,\mu}(\text{leptons})$. But there are also now all the semi-leptonic
processes such as the $\Delta S = 0$ (strangeness conserving) one
$$d \to u + e^- + \bar\nu_e \qquad (20.93)$$
and the $\Delta S = 1$ (strangeness changing) one
$$s \to u + e^- + \bar\nu_e. \qquad (20.94)$$


The notion that the ‘total current’ should be the sum of a hadronic and a 
leptonic part is already familiar from electromagnetism — see, for example, 
equation (8.91). 
The transition (20.94), for example, is the underlying process in semi- 
leptonic decays such as 
$$\Sigma^- \to n + e^- + \bar\nu_e \qquad (20.95)$$
and
$$K^- \to \pi^0 + e^- + \bar\nu_e \qquad (20.96)$$


as indicated in figure 20.3. 

The ‘s’ quark is assigned $S = -1$ and charge $-\frac13 e$. The $s \to u$ transition
is then referred to as one with ‘$\Delta S = \Delta Q$’, meaning that the change
in the quark (or hadronic) strangeness is equal to the change in the quark
(or hadronic) charge: both the strangeness and the charge increase by 1 unit.
Prior to the advent of the quark model, and the Cabibbo hypothesis, it had
been established empirically that all known strangeness-changing semileptonic
decays satisfied the rules $|\Delta S| = 1$ and $\Delta S = \Delta Q$. The u-s current in (20.90)
satisfies these rules automatically. Note, for example, that the process apparently
similar to (20.95), $\Sigma^+ \to n + e^+ + \nu_e$, is forbidden in the lowest order (it
requires a double quark transition from suu to udd). All known data on such
decays can be fit with a value $\sin\theta_C \simeq 0.22$ for the Cabibbo angle $\theta_C$. This
relatively small angle is therefore a measure of the suppression of $|\Delta S| = 1$
processes relative to $\Delta S = 0$ ones.
The Cabibbo current can be written in a more compact form by introduc- 
ing the ‘mixed’ field 
$$d' = \cos\theta_C\, d + \sin\theta_C\, s. \qquad (20.97)$$
Then
$$j^\mu_{\rm Cab}(u, d, s) = \bar u \gamma^\mu \frac{(1-\gamma_5)}{2} d'. \qquad (20.98)$$
In 1970 Glashow, Iliopoulos and Maiani (1970) (GIM) drew attention to
a theoretical problem with the interaction (20.92) if used in second order.
Now it is, of course, the case that this interaction is not renormalizable, as
noted previously for the purely leptonic one (20.40), since $G_F$ has dimensions
of an inverse mass squared. As we saw in section 11.7, this means that one-loop
diagrams will typically diverge quadratically, so that the contribution
of such a second-order process will be of order $(G_F \cdot G_F\Lambda^2)$ where $\Lambda$ is a cut-off,
compared to the first-order amplitude $G_F$. Recalling from (20.46) that
$G_F \sim 10^{-5}\ \mathrm{GeV}^{-2}$, we see that for $\Lambda \sim 10$ GeV such a correction could
be significant if accurate enough data exist. GIM pointed out, in particular,
that some second-order processes could be found which violated the (hitherto)
well-established phenomenological selection rules, such as the $|\Delta S| = 1$ and
$\Delta S = \Delta Q$ rules already discussed. For example, there could be $\Delta S = 2$
amplitudes contributing to the $K_L - K_S$ mass difference (see Renton 1990,
section 9.1.6, for example), as well as contributions to unobserved decay modes
such as
$$K^+ \to \pi^+ + \nu + \bar\nu \qquad (20.99)$$
which has a neutral lepton pair in association with a strangeness change for
the hadron. In fact, experiment placed very tight limits on the rate for (20.99),
and still does: the branching fraction is $(1.7 \pm 1.1) \times 10^{-10}$ (Nakamura et
al. 2010). This seemed to imply a surprisingly low value of the cut-off, say
$\sim 3$ GeV (Mohapatra et al. 1968).
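The order-of-magnitude argument above can be made explicit. The short sketch below evaluates the ratio of the second-order contribution, of order $G_F \cdot G_F\Lambda^2$, to the first-order amplitude $G_F$, using the rounded value $G_F \sim 10^{-5}\ \mathrm{GeV}^{-2}$ quoted in the text; the function name is our own, and the numbers are estimates only.

```python
G_F = 1.0e-5  # GeV^-2, order of magnitude quoted in the text

def relative_second_order(cutoff_gev):
    """Ratio (G_F * G_F * Lambda^2) / G_F = G_F * Lambda^2, dimensionless."""
    return G_F * cutoff_gev**2

for cutoff in (3.0, 10.0, 100.0):
    print(cutoff, relative_second_order(cutoff))
```

For $\Lambda \sim 10$ GeV the relative correction is of order $10^{-3}$, which is why tight experimental limits on rare modes such as (20.99) translate into a low cut-off.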

Partly in order to address this problem, and partly as a revival of an 
earlier lepton-quark symmetry proposal (Bjorken and Glashow 1964), GIM 
introduced a fourth quark, now called c (the charm quark) with charge $\frac23 e$.
Note that in 1970 the $\tau$-lepton had not been discovered, so only two lepton
family pairs $(\nu_e, e)$, $(\nu_\mu, \mu)$ were known; this fourth quark therefore did restore
the balance, via the two quark family pairs (u,d), (c,s). In particular, a
second quark current could now be hypothesized, involving the (c,s) pair. GIM
postulated that the c-quark was coupled to the ‘orthogonal’ d-s combination
(cf (20.97))
$$s' = -\sin\theta_C\, d + \cos\theta_C\, s. \qquad (20.100)$$
The complete four-quark charged current is then
$$j^\mu_{\rm GIM}(u, d, c, s) = \bar u \gamma^\mu \frac{(1-\gamma_5)}{2} d' + \bar c \gamma^\mu \frac{(1-\gamma_5)}{2} s'. \qquad (20.101)$$
The form (20.101) had already been suggested by Bjorken and Glashow (1964).
The new feature of GIM was the observation that, assuming an exact SU(4)$_{\rm f}$
symmetry for the four quarks (in particular, equal masses), all second-order
contributions which could have violated the $|\Delta S| = 1$, $\Delta S = \Delta Q$ selection
rules now vanished. Further, to the extent that the (unknown) mass of the
charm quark functioned as an effective cut-off $\Lambda$, due to breaking of the SU(4)$_{\rm f}$
symmetry, they estimated $m_c$ to lie in the range 3-4 GeV, from the observed
$K_L - K_S$ mass difference.

GIM went on to speculate that the non-renormalizability could be over- 
come if the weak interactions were described by an SU(2) Yang-Mills gauge 
theory, involving a triplet $(W^+, W^-, W^0)$ of gauge bosons. In this case, it
is natural to introduce the idea of (weak) isospin, in terms of which the
pairs $(\nu_e, e)$, $(\nu_\mu, \mu)$, $(u, d')$, $(c, s')$ are all $t = \frac12$ doublets with $t_3 = \pm\frac12$.
Charge-raising currents then involve the raising matrix
$$\tau_+ \equiv \tfrac12(\tau_1 + \mathrm{i}\tau_2) = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \qquad (20.102)$$
and charge-lowering ones the matrix $\tau_- = (\tau_1 - \mathrm{i}\tau_2)/2$. The full symmetry
must also involve the matrix $\tau_3$, given by the commutator $[\tau_+, \tau_-] = \tau_3$.
Whereas $\tau_+$ and $\tau_-$ would (in this model) be associated with transitions
mediated by $W^\pm$, transitions involving $\tau_3$ would be mediated by $W^0$, and would
correspond to ‘neutral current’ transitions for quarks. We now know that
things are slightly more complicated than this: the correct symmetry is the
SU(2) x U(1) of Glashow (1961), also invoked by GIM. Skipping therefore
some historical steps, we parametrize the weak quark neutral current as (cf 
(20.86) for the leptonic analogue) 


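$$g_N \sum_{q = u, c, d', s'} \bar q \gamma^\mu \left[c^q_L\,\frac{(1-\gamma_5)}{2} + c^q_R\,\frac{(1+\gamma_5)}{2}\right] q \qquad (20.103)$$
for the four flavours so far in play. In the GSW theory, the $c^q$’s are predicted
to be
$$c^{u,c}_L = \tfrac12 - \tfrac23 a, \qquad c^{u,c}_R = -\tfrac23 a \qquad (20.104)$$
$$c^{d,s}_L = -\tfrac12 + \tfrac13 a, \qquad c^{d,s}_R = \tfrac13 a \qquad (20.105)$$
where $a = \sin^2\theta_W$ as before, and $g_N = g/\cos\theta_W$.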
One feature of (20.103) is very important. Consider the terms
$$\bar d'\{\ldots\}d' + \bar s'\{\ldots\}s'. \qquad (20.106)$$
It is simple to verify that, whereas either part of (20.106) alone contains a
strangeness changing neutral combination such as $\bar d\{\ldots\}s$ or $\bar s\{\ldots\}d$, such
combinations vanish in the sum, leaving the result diagonal in quark flavour.
Thus there are no first-order neutral flavour-changing currents in this model,
a result which will be extended to three flavours in section 20.7.3.
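The cancellation just described can be checked in a few lines. The sketch below expands $\bar d'\{\ldots\}d'$ and $\bar s'\{\ldots\}s'$ in terms of the mass-basis fields, using the mixed fields (20.97) and (20.100) with the illustrative value $\sin\theta_C = 0.22$, and collects the coefficients of $\bar d d$, $\bar d s$, $\bar s d$ and $\bar s s$. It relies on the fact that the same Dirac structure $\{\ldots\}$ and the same coefficients multiply both terms; the helper names are our own.

```python
import math

def cabibbo_gim_matrix(theta_c):
    """Rows give d' and s' in terms of (d, s), as in (20.97) and (20.100)."""
    c, s = math.cos(theta_c), math.sin(theta_c)
    return [[c, s], [-s, c]]

def flavour_coefficients(row):
    """Coefficients of (dbar d, dbar s; sbar d, sbar s) in qbar'{...}q'."""
    a, b = row
    return [[a * a, a * b], [b * a, b * b]]

theta_c = math.asin(0.22)                  # illustrative Cabibbo angle
mixing = cabibbo_gim_matrix(theta_c)
d_part = flavour_coefficients(mixing[0])   # from dbar' {...} d'
s_part = flavour_coefficients(mixing[1])   # from sbar' {...} s'
total = [[d_part[i][j] + s_part[i][j] for j in range(2)] for i in range(2)]

print("dbar s coefficient in d' term alone:", d_part[0][1])
print("dbar s coefficient in s' term alone:", s_part[0][1])
print("sum of both terms:", total)
```

Each term alone carries a strangeness-changing coefficient $\pm\cos\theta_C\sin\theta_C$, while the sum is the unit matrix in flavour space.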

In 1974, Gaillard and Lee (1974) performed a full one-loop calculation
of the $K_L - K_S$ mass difference in the GSW model as extended by GIM
to quarks and using the renormalization techniques recently developed by ’t
Hooft (1971b). They were able to predict $m_c \sim 1.5$ GeV for the charm quark
mass, a result spectacularly confirmed by the subsequent discovery of the $c\bar c$
states in charmonium, and of charmed mesons and baryons of the appropriate
mass.

In summary, then, the essential feature of the quark weak currents in 
the two-generation model is that they have the universal V-A form, but the 
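participating fields are $(u, d')$, $(c, s')$ where $d'$ and $s'$ are not the fields $d$, $s$ with
definite mass, but rather are related to them by an orthogonal transformation:
$$\begin{pmatrix} d' \\ s' \end{pmatrix} = \begin{pmatrix} \cos\theta_C & \sin\theta_C \\ -\sin\theta_C & \cos\theta_C \end{pmatrix} \begin{pmatrix} d \\ s \end{pmatrix}. \qquad (20.107)$$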
In section 20.8 we shall enlarge this picture to three generations, where signif- 
icant new features occur, specifically CP violation. In chapter 22 we shall see 
how this transformation from the ‘mass’ basis to the ‘weak interaction’ basis 
arises via the gauge-invariant interactions of the Standard Model. 


20.7.2 Deep inelastic neutrino scattering 


We now have enough theory to present another illustrative calculation within 
the framework of the “current-current” model, this time involving neutrinos 
and quarks. We shall calculate cross sections for deep inelastic neutrino scat- 
tering from nucleons, using the parton model introduced (for electromagnetic 
interactions) in chapter 9. In particular, we shall consider the processes 


$$\nu_\mu N \to \mu^- + X \qquad (20.108)$$
$$\bar\nu_\mu N \to \mu^+ + X \qquad (20.109)$$


which of course involve the charged currents, for both leptons and quarks. 
Studies of these reactions at Fermilab and CERN in the 1970s and 1980s 
played a crucial part in establishing the quark structure of the nucleon, in 
particular the quark distribution functions. 

The general process is illustrated in figure 20.4. By now we are becoming 
accustomed to the idea that such processes are in fact mediated by the $W^\pm$,
but we shall assume that the momentum transfers are such that the W-propagator
is effectively constant. The effective lepton-quark interaction will
then take the form
$$\mathcal{H}^{\rm eff}_{\nu q} = \frac{G_F}{\sqrt2}\,\bar\mu \gamma_\mu (1 - \gamma_5)\nu_\mu \left[\bar u \gamma^\mu (1 - \gamma_5) d + \bar c \gamma^\mu (1 - \gamma_5) s\right], \qquad (20.110)$$
leading to expressions for the parton-level subprocess amplitudes which are
exactly similar to that in (20.50) for $\nu_\mu + e^- \to \mu^- + \nu_e$. Note that we are
considering only the four flavours u, d, c, s to be ‘active’, and we have set
$\theta_C = 0$.

FIGURE 20.4
Inelastic neutrino scattering from a nucleon.

As in (20.53), the $\nu_\mu$ cross section will have the general form
$$d\sigma^{(\nu)} \propto N_{\mu\nu} W^{\mu\nu}_{(\nu)}(q, p) \qquad (20.111)$$
where $N_{\mu\nu}$ is the neutrino tensor of (20.67). The form of the weak hadron
tensor $W^{\mu\nu}_{(\nu)}$ is deduced from Lorentz invariance. In the approximation of
neglecting lepton masses, we can ignore any dependence on the 4-vector $q$
since
$$q^\mu N_{\mu\nu} = q^\nu N_{\mu\nu} = 0. \qquad (20.112)$$
Just as $N_{\mu\nu}$ contains the pseudotensor $\epsilon_{\mu\nu\alpha\beta}$, so too will $W^{\mu\nu}_{(\nu)}$, since parity is
not conserved. In a manner similar to equation (9.10) for the case of electron
scattering, and following the steps that led from (20.67) to (20.72), we define
effective neutrino structure functions by
$$W^{\mu\nu}_{(\nu)} = (-g^{\mu\nu})\,W_1^{(\nu)} + \frac{1}{M^2}\,p^\mu p^\nu\,W_2^{(\nu)} - \frac{\mathrm{i}}{2M^2}\,\epsilon^{\mu\nu\gamma\delta}\,p_\gamma q_\delta\,W_3^{(\nu)}. \qquad (20.113)$$
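In general, the structure functions depend on two variables, say $Q^2$ and $\nu$,
where $Q^2 = -(k - k')^2$ and $\nu = p\cdot q/M$; but in the Bjorken limit approximate
scaling is observed, as in the electron case:
$$Q^2 \to \infty, \quad \nu \to \infty, \qquad x = Q^2/2M\nu \ \text{fixed} \qquad (20.114)$$
$$\nu W_2^{(\nu)}(Q^2, \nu) \to F_2^{(\nu)}(x) \qquad (20.115)$$
$$M W_1^{(\nu)}(Q^2, \nu) \to F_1^{(\nu)}(x) \qquad (20.116)$$
$$\nu W_3^{(\nu)}(Q^2, \nu) \to F_3^{(\nu)}(x) \qquad (20.117)$$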


where, as with (9.21) and (9.22), the physics lies in the assertion that the F’s 
are finite. This scaling can again be interpreted in terms of pointlike scattering
from partons — which we shall take to have quark quantum numbers. 

In the ‘laboratory’ frame (in which the nucleon is at rest) the cross section 
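in terms of $W_1$, $W_2$ and $W_3$ may be derived in the usual way from (cf equation
(9.11))
$$d\sigma^{(\nu)} = \left(\frac{G_F}{\sqrt2}\right)^2 \frac{1}{4k\cdot p}\,4\pi M\,N_{\mu\nu} W^{\mu\nu}_{(\nu)}\,\frac{d^3k'}{2k'(2\pi)^3}. \qquad (20.118)$$
In terms of ‘laboratory’ variables, one obtains (problem 20.9)
$$\frac{d^2\sigma^{(\nu)}}{dQ^2\,d\nu} = \frac{G_F^2}{2\pi}\,\frac{k'}{k}\left(W_2^{(\nu)}\cos^2(\theta/2) + W_1^{(\nu)}\,2\sin^2(\theta/2) + \frac{k + k'}{M}\,\sin^2(\theta/2)\,W_3^{(\nu)}\right). \qquad (20.119)$$
For an incoming antineutrino beam, the $W_3$ term changes sign.
In neutrino scattering it is common to use the variables $x$, $\nu$ and the
‘inelasticity’ $y$ where
$$y = p\cdot q/p\cdot k. \qquad (20.120)$$
In the ‘laboratory’ frame, $\nu = E - E'$ (the energy transfer to the nucleon) and
$y = \nu/E$. The cross section can be written in the form (see problem 20.9)
$$\frac{d^2\sigma^{(\nu)}}{dx\,dy} = \frac{G_F^2}{2\pi}\,s\left(F_2^{(\nu)}\,\frac{1 + (1-y)^2}{2} + xF_3^{(\nu)}\,\frac{1 - (1-y)^2}{2}\right) \qquad (20.121)$$
in terms of the Bjorken scaling functions, and we have assumed the relation
$$2xF_1^{(\nu)} = F_2^{(\nu)} \qquad (20.122)$$
appropriate for spin-$\frac12$ constituents.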
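The kinematic variables used above are easily computed from what is measured in the ‘laboratory’ frame. The sketch below assumes a massless incoming neutrino of energy $E$ and an outgoing lepton of energy $E'$ at angle $\theta$, so that $Q^2 = 4EE'\sin^2(\theta/2)$ (a standard massless-lepton relation that we supply ourselves), and takes $M = 0.938$ GeV for the nucleon mass; the function name and the sample numbers are hypothetical.

```python
import math

M_NUCLEON = 0.938  # GeV, assumed nucleon mass

def dis_variables(e_in, e_out, theta):
    """Return (Q^2, nu, x, y) in the nucleon rest frame, lepton masses neglected."""
    q2 = 4.0 * e_in * e_out * math.sin(theta / 2.0) ** 2
    nu = e_in - e_out                  # energy transfer to the nucleon
    x = q2 / (2.0 * M_NUCLEON * nu)    # Bjorken x, as in (20.114)
    y = nu / e_in                      # inelasticity, (20.120) in the lab frame
    return q2, nu, x, y

q2, nu, x, y = dis_variables(100.0, 60.0, 0.05)   # illustrative event
print(q2, nu, x, y)
# Consistency check: Q^2 = 2 M E x y = s x y with s = 2 M E
print(2.0 * M_NUCLEON * 100.0 * x * y)
```

The last line illustrates the relation $Q^2 = sxy$, with $s = 2ME$, which connects the variables of (20.119) to those of (20.121).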

We now turn to the parton-level subprocesses. Their cross sections can be 
straightforwardly calculated in the same way as for $\nu_\mu e^-$ scattering in section
20.5. We obtain (problem 20.10)
$$\nu q,\ \bar\nu\bar q: \qquad \frac{d^2\sigma}{dx\,dy} = \frac{G_F^2}{\pi}\,sx\,\delta\!\left(x - \frac{Q^2}{2M\nu}\right) \qquad (20.123)$$
$$\nu\bar q,\ \bar\nu q: \qquad \frac{d^2\sigma}{dx\,dy} = \frac{G_F^2}{\pi}\,sx\,(1-y)^2\,\delta\!\left(x - \frac{Q^2}{2M\nu}\right). \qquad (20.124)$$
The factor $(1-y)^2$ in the $\nu\bar q$, $\bar\nu q$ cases means that the reaction is forbidden
at $y = 1$ (backwards in the CM frame). This follows from the V-A nature of
the current, and angular momentum conservation, as a simple helicity argument
shows. Consider for example the case $\nu\bar q$ shown in figure 20.5, with the
helicities marked as shown. In our current-current interaction there are no
gradient coupling terms and therefore no momenta in the momentum-space
matrix element. This means that no orbital angular momentum is available
to account for the reversal of net helicity in the initial and final states in figure
20.5. The lack of orbital angular momentum can also be inferred physically
from the ‘pointlike’ nature of the current-current coupling. For the $\nu q$ or $\bar\nu\bar q$
cases, the initial and final helicities add to zero, and backward scattering is
allowed.

FIGURE 20.5
Suppression of $\nu_\mu \bar q \to \mu^- \bar q$ for $y = 1$: (a) initial state helicities; (b) final state
helicities at $y = 1$.

The contributing processes are 


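$$\nu d \to l^- u, \qquad \bar\nu \bar d \to l^+ \bar u \qquad (20.125)$$
$$\nu \bar u \to l^- \bar d, \qquad \bar\nu u \to l^+ d, \qquad (20.126)$$
the first pair having the cross section (20.123), the second (20.124). Following
the same steps as in the electron scattering case (sections 9.2 and 9.3) we
obtain
$$F_2^{\nu p} = F_2^{\bar\nu n} = 2x[d(x) + \bar u(x)] \qquad (20.127)$$
$$F_3^{\nu p} = F_3^{\bar\nu n} = 2[d(x) - \bar u(x)] \qquad (20.128)$$
$$F_2^{\nu n} = F_2^{\bar\nu p} = 2x[u(x) + \bar d(x)] \qquad (20.129)$$
$$F_3^{\nu n} = F_3^{\bar\nu p} = 2[u(x) - \bar d(x)]. \qquad (20.130)$$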
Inserting (20.127) and (20.128) into (20.121), for example, we find 
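$$\frac{d^2\sigma^{(\nu p)}}{dx\,dy} = 2\sigma_0\,x\left[d(x) + (1-y)^2\,\bar u(x)\right] \qquad (20.131)$$
where
$$\sigma_0 = \frac{G_F^2 s}{2\pi} = \frac{G_F^2 M E}{\pi} \simeq 1.5 \times 10^{-42}\,(E/\mathrm{GeV})\ \mathrm{m}^2 \qquad (20.132)$$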


is the basic ‘pointlike’ total cross section (compare (20.83)). Note the small 
magnitude of this cross section, as compared with the electromagnetic one of 
equation (B.18) in volume 1, which was o = Wea) x 10737m?. Similarly, 


one finds
$$\frac{d^2\sigma^{(\bar\nu p)}}{dx\,dy} = 2\sigma_0\,x\left[(1-y)^2\,u(x) + \bar d(x)\right]. \qquad (20.133)$$
The corresponding results for $\nu n$ and $\bar\nu n$ are given by interchanging $u(x)$ and
$d(x)$, and $\bar u(x)$ and $\bar d(x)$.
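The step from (20.121) to (20.131) is a matter of simple algebra, which the following sketch checks numerically at an arbitrary point, together with the numerical value of $\sigma_0$ in (20.132). The parton densities passed in are arbitrary test numbers rather than fitted distributions, the constants ($G_F$, $M$, and the conversion $1\ \mathrm{GeV}^{-2} \approx 0.3894 \times 10^{-31}\ \mathrm{m}^2$) are standard values that we supply ourselves, and the function names are our own.

```python
import math

G_F = 1.166e-5           # GeV^-2 (approximate, assumed input)
M_NUCLEON = 0.938        # GeV (assumed input)
GEV2_TO_M2 = 0.3894e-31  # 1 GeV^-2 in m^2

def sigma0_m2(e_gev):
    """sigma_0 = G_F^2 M E / pi of (20.132), converted to m^2."""
    return G_F**2 * M_NUCLEON * e_gev / math.pi * GEV2_TO_M2

def d2sigma_from_structure_functions(f2, xf3, y, sigma0):
    """Right-hand side of (20.121), written with sigma_0 = G_F^2 s / 2 pi."""
    return sigma0 * (f2 * (1 + (1 - y) ** 2) / 2 + xf3 * (1 - (1 - y) ** 2) / 2)

def d2sigma_nu_p(x, y, d, ubar, sigma0):
    """Parton-model form (20.131) for neutrino-proton scattering."""
    return 2 * sigma0 * x * (d + (1 - y) ** 2 * ubar)

x, y, d, ubar = 0.2, 0.4, 1.1, 0.3   # arbitrary test values
f2 = 2 * x * (d + ubar)              # (20.127)
xf3 = x * 2 * (d - ubar)             # x times (20.128)
print(d2sigma_from_structure_functions(f2, xf3, y, 1.0))
print(d2sigma_nu_p(x, y, d, ubar, 1.0))
print(sigma0_m2(1.0))                # of order 1.5e-42 m^2 at E = 1 GeV
```

The first two printed numbers agree, showing that the quark term is independent of $y$ while the antiquark term carries the factor $(1-y)^2$.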
The target nuclei usually have high mass number (in order to increase the 


cross section), with approximately equal numbers of protons and neutrons; it 
is then appropriate to average the ‘n’ and ‘p’ results to obtain an ‘isoscalar’
cross section $\sigma^{(\nu N)}$ or $\sigma^{(\bar\nu N)}$:
$$\frac{d^2\sigma^{(\nu N)}}{dx\,dy} = \sigma_0\,x\left[q(x) + (1-y)^2\,\bar q(x)\right] \qquad (20.134)$$
$$\frac{d^2\sigma^{(\bar\nu N)}}{dx\,dy} = \sigma_0\,x\left[(1-y)^2\,q(x) + \bar q(x)\right] \qquad (20.135)$$
where $q(x) = u(x) + d(x)$ and $\bar q(x) = \bar u(x) + \bar d(x)$.
Many simple and striking predictions now follow from these quark parton 
results. For example, by integrating (20.134) and (20.135) over x we can write 


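$$\frac{d\sigma^{(\nu N)}}{dy} = \sigma_0\left[Q + (1-y)^2\,\bar Q\right] \qquad (20.136)$$
$$\frac{d\sigma^{(\bar\nu N)}}{dy} = \sigma_0\left[(1-y)^2\,Q + \bar Q\right] \qquad (20.137)$$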


where $Q = \int x\,q(x)\,dx$ is the fraction of the nucleon’s momentum carried by
quarks, and similarly for $\bar Q$. These two distributions in $y$ (‘inelasticity
distributions’) therefore give a direct measure of the quark and antiquark
composition of the nucleon. Figure 20.6 shows the inelasticity distributions as
reported by the CDHS collaboration (de Groot et al. 1979), from which the
authors extracted the ratio
$$\bar Q/(Q + \bar Q) = 0.15 \pm 0.03 \qquad (20.138)$$
after applying radiative corrections. An even more precise value can be
obtained by looking at the region near $y = 1$ for $\bar\nu N$, which is dominated by $\bar Q$,
the small $Q$ contribution ($\propto (1-y)^2$) being subtracted out using $\nu N$ data at
the same $y$. This method yields
$$\bar Q/(Q + \bar Q) = 0.15 \pm 0.01. \qquad (20.139)$$


Integrating (20.136) and (20.137) over y gives 


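$$\sigma^{(\nu N)} = \sigma_0\left(Q + \tfrac13\bar Q\right) \qquad (20.140)$$
$$\sigma^{(\bar\nu N)} = \sigma_0\left(\tfrac13 Q + \bar Q\right) \qquad (20.141)$$
and hence
$$Q + \bar Q = 3\left(\sigma^{(\nu N)} + \sigma^{(\bar\nu N)}\right)/4\sigma_0 \qquad (20.142)$$
while
$$\bar Q/(Q + \bar Q) = \frac12\left(\frac{3r - 1}{1 + r}\right) \qquad (20.143)$$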
FIGURE 20.6

Charged-current inelasticity (y) distribution as measured by CDHS; figure 
from K Winter (2000) Neutrino Physics 2nd edn, courtesy Cambridge Uni- 
versity Press. 


where $r = \sigma^{(\bar\nu N)}/\sigma^{(\nu N)}$. From total cross section measurements, and including
c and s contributions, the CHARM collaboration (Allaby et al. 1988) reported
$$Q + \bar Q = 0.492 \pm 0.006(\text{stat}) \pm 0.019(\text{syst}) \qquad (20.144)$$
$$\bar Q/(Q + \bar Q) = 0.154 \pm 0.005(\text{stat}) \pm 0.011(\text{syst}). \qquad (20.145)$$
The second figure is in good agreement with (20.139), and the first shows that
only about 50% of the nucleon momentum is carried by charged partons, the
rest being carried by the gluons, which do not have weak or electromagnetic
interactions.

Equations (20.140) and (20.141), together with (20.132), predict that the
total cross sections $\sigma^{(\nu N)}$ and $\sigma^{(\bar\nu N)}$ rise linearly with energy $E$. This (parton
model) prediction was confirmed as early as 1975 (Perkins 1975), soon after
the model’s success in deep inelastic electron scattering; later data is included
in figure 20.7. In fact, both $\sigma^{(\nu N)}/E$ and $\sigma^{(\bar\nu N)}/E$ are found to be independent
of $E$ up to $E \sim 350$ GeV (Nakamura et al. 2010).

Detailed comparison between the data at high energies and the earlier data
of figure 20.7 at $E_\nu$ up to 15 GeV reveals that the $\bar Q$ fraction is increasing
with energy. This is in accordance with the expectation of QCD corrections
to the parton model (section 15.6): the $\bar Q$ distribution is large at small $x$,
and scaling violations embodied in the evolution of the parton distributions
predict a rise at small $x$ as the energy scale increases.

FIGURE 20.7
Low-energy $\nu$ and $\bar\nu$ cross-sections (vertical axis: $\sigma$ in $10^{-38}$ cm$^2$/nucleon);
figure from K Winter (2000) Neutrino Physics 2nd edn, courtesy Cambridge
University Press.
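The chain from the inelasticity distributions (20.136) and (20.137) to the relation (20.143) can be checked numerically. In the sketch below the momentum fractions $Q$ and $\bar Q$ are hypothetical inputs chosen only for illustration (they are not the measured values), $\sigma_0$ is set to 1, and the $y$ integration is done with a simple midpoint rule; all names are our own.

```python
def integrate_unit_interval(f, n=10000):
    """Midpoint-rule integral of f(y) over 0 <= y <= 1."""
    h = 1.0 / n
    return sum(f((i + 0.5) * h) for i in range(n)) * h

def antiquark_fraction(r):
    """Qbar / (Q + Qbar) from the cross-section ratio r, as in (20.143)."""
    return 0.5 * (3.0 * r - 1.0) / (1.0 + r)

Q, QBAR = 0.42, 0.07   # hypothetical momentum fractions, illustration only

# Integrate (20.136) and (20.137) over y, in units of sigma_0
sigma_nu = integrate_unit_interval(lambda y: Q + (1 - y) ** 2 * QBAR)
sigma_nubar = integrate_unit_interval(lambda y: (1 - y) ** 2 * Q + QBAR)

r = sigma_nubar / sigma_nu
print(sigma_nu, Q + QBAR / 3)        # compare with (20.140)
print(sigma_nubar, Q / 3 + QBAR)     # compare with (20.141)
print(antiquark_fraction(r), QBAR / (Q + QBAR))
```

For a nucleon containing no antiquarks the ratio would be $r = 1/3$, so the excess of the measured $r$ over $1/3$ is what determines the antiquark fraction.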

Returning now to (20.127)-(20.130), the two sum rules of (9.65) and (9.66) 
can be combined to give 


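$$3 = \int_0^1 dx\,[u(x) + d(x) - \bar u(x) - \bar d(x)] \qquad (20.146)$$
$$\phantom{3} = \frac12\int_0^1 dx\,\left(F_3^{\nu p} + F_3^{\nu n}\right) \qquad (20.147)$$
$$\phantom{3} = \int_0^1 dx\,F_3^{\nu N} \qquad (20.148)$$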


which is the Gross-Llewellyn Smith sum rule (1969), expressing the fact that
the number of valence quarks per nucleon is three. The CDHS collaboration
(de Groot et al. 1979) quoted
$$I_{\rm GLS} \equiv \int_0^1 dx\,F_3^{\nu N} = 3.2 \pm 0.5. \qquad (20.149)$$


In perturbative QCD there are corrections expressible as a power series in $\alpha_s$,
so that the parton model result is only reached as $Q^2 \to \infty$:
$$I_{\rm GLS}(Q^2) = 3\left[1 + d_1\alpha_s/\pi + d_2\alpha_s^2/\pi^2 + \ldots\right] \qquad (20.150)$$
FIGURE 20.8

CCFR neutrino-iron structure functions $xF_3^{(\nu)}$ (Shaevitz et al. 1995). The
solid line is the next-to-leading order (one-loop) QCD prediction, and the 
dotted line is an extrapolation to regions outside the kinematic cuts for the 
fit. 


where $d_1 = -1$ (Altarelli et al. 1978a, 1978b), $d_2 = -55/12 + N_f/3$ (Gorishnii
and Larin 1986) where $N_f$ is the number of active flavours. The CCFR
collaboration (Shaevitz et al. 1995) measured $I_{\rm GLS}$ in antineutrino-nucleon
scattering at $\langle Q^2\rangle \sim 3\ \mathrm{GeV}^2$. They obtained
$$I_{\rm GLS}(\langle Q^2\rangle = 3\ \mathrm{GeV}^2) = 2.50 \pm 0.02 \pm 0.08 \qquad (20.151)$$
in agreement with the $O(\alpha_s^3)$ calculation of Larin and Vermaseren (1991) using
$\Lambda_{\overline{\rm MS}} = 250 \pm 50$ MeV.
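The size of the perturbative corrections in (20.150) is easily explored. The sketch below evaluates the series truncated after the $d_2$ term, with the coefficients quoted above; the inputs $\alpha_s = 0.3$ and $N_f = 3$ are illustrative assumptions of ours and are not taken from any fit, so the output should not be compared in detail with (20.151), which refers to a higher-order calculation.

```python
import math

def gls_sum_rule(alpha_s, n_f):
    """(20.150) truncated at second order in alpha_s / pi."""
    a = alpha_s / math.pi
    d1 = -1.0
    d2 = -55.0 / 12.0 + n_f / 3.0
    return 3.0 * (1.0 + d1 * a + d2 * a * a)

print(gls_sum_rule(0.0, 3))   # parton-model limit: 3
print(gls_sum_rule(0.3, 3))   # illustrative coupling, roughly a 13% reduction
```

Both corrections are negative for $N_f = 3$, so the measured sum rule at moderate $Q^2$ is expected to lie below the parton-model value of 3.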

The predicted $Q^2$ evolution of $xF_3$ is particularly simple since it is not
coupled to the gluon distribution. To leading order, the $xF_3$ evolution is
given by (cf (15.109))
$$\frac{d}{d\ln Q^2}\left(xF_3(x, Q^2)\right) = \frac{\alpha_s(Q^2)}{\pi}\int_x^1 P_{qq}(z)\,xF_3\!\left(\frac{x}{z}, Q^2\right)\frac{dz}{z}. \qquad (20.152)$$


Figure 20.8, taken from Shaevitz et al. (1995), shows a comparison of the
CCFR data with the next-to-leading order calculation of Duke and Owens
(1984). This fit yields a value of $\alpha_s$ at $Q^2 = M_Z^2$ given by
$$\alpha_s(M_Z^2) = 0.111 \pm 0.002 \pm 0.003. \qquad (20.153)$$


The Adler sum rule (Adler 1963) involves the functions $F_2^{\bar\nu p}$ and $F_2^{\nu p}$:
$$I_A = \int_0^1 \frac{dx}{x}\left(F_2^{\bar\nu p} - F_2^{\nu p}\right). \qquad (20.154)$$
In the simple model of (20.127)-(20.130), the right-hand side of $I_A$ is just
$$2\int_0^1 dx\,\left(u(x) + \bar d(x) - d(x) - \bar u(x)\right) \qquad (20.155)$$
which represents four times the average of $I_3$ (isospin) of the target, which is
$\frac12$ for the proton. This sum rule follows from the conservation of the charged
weak current (as will be true in the Standard Model, since this is a gauge
symmetry current, as we shall see in the following chapter). Its measurement,
however, depends precisely on separating the non-isoscalar contribution ($I_A$
vanishes for the isoscalar average ‘N’). The BEBC collaboration (Allasia et al.
1984, 1985) reported:
$$I_A = 2.02 \pm 0.40 \qquad (20.156)$$
in agreement with the expected value 2.
Relations (20.127)-(20.130) allow the $F_2$ functions for electron (muon) and
neutrino scattering to be simply related. From (9.58) and (9.61) we have
$$F_2^{eN} = \frac12\left(F_2^{ep} + F_2^{en}\right) = \frac{5}{18}\,x(u + \bar u + d + \bar d) + \frac19\,x(s + \bar s) + \ldots \qquad (20.157)$$
while (20.127) and (20.129) give
$$F_2^{\nu N} = \frac12\left(F_2^{\nu p} + F_2^{\nu n}\right) = x(u + d + \bar u + \bar d). \qquad (20.158)$$
Assuming that the non-strange contributions dominate, the neutrino and
charged lepton structure functions should be approximately in the ratio 18/5,
which is the reciprocal of the mean squared charge of the u and d quarks
in the nucleon. Figure 20.9 shows the neutrino results on $F_2$ and $xF_3$ together
with those from several $\mu N$ experiments scaled by the factor 18/5. The
agreement is satisfactory for a tree-level parton model calculation.
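The factor 18/5 is simple exact arithmetic, as the following short sketch shows; it averages the squared charges of the u and d quarks (in units of $e$) using exact fractions. The variable names are our own.

```python
from fractions import Fraction

# Quark charges in units of e
charges = {"u": Fraction(2, 3), "d": Fraction(-1, 3)}

# Mean squared charge of the u and d quarks
mean_squared_charge = sum(q * q for q in charges.values()) / len(charges)
ratio = 1 / mean_squared_charge

print(mean_squared_charge)   # 5/18
print(ratio)                 # 18/5
```

The same number appears as the coefficient of the non-strange term in (20.157).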

From (20.127)-(20.130) we see that $F_2^{\nu N} - xF_3^{\nu N} = 2x(\bar u + \bar d)$, which is just
the sea distribution; figure 20.9 shows that this is concentrated at small x, as 
we already inferred in section 9.3. 

We have mentioned QCD corrections to the simple parton model at several 
points. Clearly the full machinery introduced in chapter 16, in the context 
of deep inelastic charged lepton scattering, can be employed for the case of 
neutrino scattering also. For further access to this area we refer to Ellis et al. 
(1996), chapter 4, and Winter (2000) chapter 5. 
FIGURE 20.9

Comparison of neutrino results (experiments CCFRR, CDHSW and CHARM)
on $F_2(x)$ and $xF_3(x)$ with those from muon production (experiments BCDMS,
BFP and EMC) properly rescaled by the factor 18/5, for a $Q^2$ ranging between
10 and 1000 GeV$^2$; figure from K Winter (2000) Neutrino Physics 2nd edn,
courtesy Cambridge University Press.


20.7.3 Three generations 


We have seen in section 20.2.2 that the V-A interaction violates both P and 
C, and that it conserves CP in interactions with massless neutrinos. But 
we know (section 4.2.3) that CP-violating transitions occur, among states 
formed from quarks in the first two generations, albeit at a very slow rate. Is 
it possible, in fact, to incorporate CP-violation with only two generations of 
quarks? 

To answer this question, we need to go back and examine the CP-transfor- 
mation properties of the interactions in more detail. Rather than work with 
the current-current form, which is after all only an approximation valid for 
energies much less than $M_{W,Z}$, we shall look at the actual gauge interactions
of the electroweak theory. Given the form of those interactions, we want to 
know the condition for CP-violation to be present. 

Consider then the particular interaction involved in $u \leftrightarrow d$ transitions:
$$V_{ud}\,\bar u \gamma^\mu d\,W_\mu + V_{ud}^*\,\bar d \gamma^\mu u\,W_\mu^\dagger - V_{ud}\,\bar u \gamma^\mu \gamma_5 d\,W_\mu - V_{ud}^*\,\bar d \gamma^\mu \gamma_5 u\,W_\mu^\dagger, \qquad (20.159)$$
where $W_\mu = (W^1_\mu - \mathrm{i}W^2_\mu)/\sqrt2$ destroys the $W^+$ or creates the $W^-$. We have
written out the Hermitian conjugate terms explicitly, keeping the coupling
$V_{ud}$ complex for the sake of generality, and separating the vector from the
axial vector parts. Problem 20.11 shows that the different parts of (20.159)
transform under C as follows (normal ordering being understood in all cases):


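$$\mathbf{C}: \qquad \bar u \gamma^\mu d \to -\bar d \gamma^\mu u, \qquad \bar u \gamma^\mu \gamma_5 d \to +\bar d \gamma^\mu \gamma_5 u, \qquad (20.160)$$
and we also know that under C, $W_\mu \to -W_\mu^\dagger$ (the dagger is as in the charged
scalar field case, and the minus sign is as in the photon $A_\mu$ case). Hence under
C, (20.159) transforms into
$$V_{ud}\,\bar d \gamma^\mu u\,W_\mu^\dagger + V_{ud}^*\,\bar u \gamma^\mu d\,W_\mu + V_{ud}\,\bar d \gamma^\mu \gamma_5 u\,W_\mu^\dagger + V_{ud}^*\,\bar u \gamma^\mu \gamma_5 d\,W_\mu. \qquad (20.161)$$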


Under P, $W_\mu$ behaves like an ordinary four-vector, so the ‘vector-vector’ products
in (20.161) are even under P, while the ‘axial vector-vector’ products are
odd under P. Thus finally, under the combined CP transformation (20.159)
becomes
$$V_{ud}\,\bar d \gamma^\mu u\,W_\mu^\dagger + V_{ud}^*\,\bar u \gamma^\mu d\,W_\mu - V_{ud}\,\bar d \gamma^\mu \gamma_5 u\,W_\mu^\dagger - V_{ud}^*\,\bar u \gamma^\mu \gamma_5 d\,W_\mu. \qquad (20.162)$$


Comparing (20.159) with (20.162) we deduce the essential result that this 
interaction conserves CP if and only if 


$$V_{ud} = V_{ud}^* \qquad (20.163)$$
that is, if the coupling is real. The same is true for all the other couplings $V_{ij}$.
The couplings we have introduced in this chapter so far only involve the
real Fermi constant $G_F$, and the elements of the Cabibbo-GIM matrix which
enters into the relation between the weakly interacting fields $(d', s')$ and the
fields with definite mass $(d, s)$:
$$\begin{pmatrix} d' \\ s' \end{pmatrix} = \begin{pmatrix} \cos\theta_C & \sin\theta_C \\ -\sin\theta_C & \cos\theta_C \end{pmatrix} \begin{pmatrix} d \\ s \end{pmatrix} \equiv V_{\rm CGIM} \begin{pmatrix} d \\ s \end{pmatrix}. \qquad (20.164)$$
All these couplings are plainly real. But could we perhaps parametrize the
$(d, s) \leftrightarrow (d', s')$ transformation differently, so as to smuggle in a complex,
CP-violating, coupling?

This is the question that Kobayashi and Maskawa asked themselves in 1972 
(Kobayashi 2009, Maskawa 2009). To answer it is not completely straightfor- 
ward, because we can always change the phases of the quark fields by inde- 
pendent constant amounts. A rephasing of the quark fields in the transition 
$i \leftrightarrow j$ with coupling $V_{ij}$ changes $V_{ij}$ by the phase factor $\exp \mathrm{i}(\alpha_i - \alpha_j)$. We
need to know whether, after allowing for this rephasing of the quark fields, an 
‘irreducible’ complex coupling can remain. 

First of all, note that the matrix $V_{\rm CGIM}$ appearing in (20.164) is orthogonal,
and this property guaranteed the vanishing of tree-level neutral
strangeness-changing transitions, as we saw after (20.106). But this could
just as well be achieved if the matrix was unitary. Now a general complex
$2 \times 2$ matrix has 8 real parameters; unitarity gives 2 real conditions from the
diagonal elements of $V_{\rm CGIM} V_{\rm CGIM}^\dagger = I$, and one complex condition from the
off-diagonal elements, leaving four real parameters. If all the elements are taken to be
real from the beginning, the matrix becomes orthogonal, as in (20.164), and
depends on only one real parameter, the ‘rotation’ in the 2-dimensional $d$-$s$
space. So in the general, unitary case, the matrix will have one real angle
parameter, and three phase parameters. But we have four quark fields whose
phases we can adjust. In fact, since only phase differences enter, we really only
have three free phases at our disposal, but that is just enough to transform
away the three phases in the unitary version of $V_{\rm CGIM}$, leaving it in the real
orthogonal form (20.164). Kobayashi and Maskawa therefore concluded that
the 2-generation GIM-type theory could not accommodate CP-violation.

In a step which may seem natural now but was very bold in 1972, they 
decided to see if there was room for CP-violation in a 3-generation model. 
(Remember that there was no sign of any third generation particles at that 
time.) The matrix transforming from the mass basis to the weak basis is 
now a 3 x 3 unitary matrix V, with 18 real parameters. There are three real 
diagonal conditions from unitarity, and three complex off-diagonal conditions, 
leaving 9 real parameters. If the elements of V are taken to be real, one has an 
orthogonal (rotation) matrix, which can be parametrized by three real Euler 
angles. That leaves 6 phase parameters in the general unitary V. We also 
have 6 quark fields, with 5 phase differences which can be adjusted. Thus 
just one irreducible phase degree of freedom can remain in V, after quark 
rephasing. Consequently, the three-generation model naturally accommodates 
CP-violation in the quark sector: this was the great discovery of Kobayashi 
and Maskawa (1973). It was another four years before the existence of the b 
quark was established, and more than twenty before the t quark was produced. 
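
The parameter counting for two and three generations can be sketched in a few
lines of Python. This is only an illustrative bookkeeping aid: the function name
`mixing_matrix_parameters` and the dictionary keys are hypothetical, introduced
here for the example.

```python
def mixing_matrix_parameters(n):
    """Count the parameters of an n x n unitary quark-mixing matrix."""
    real_parameters = 2 * n * n             # general complex n x n matrix
    unitarity_conditions = n + n * (n - 1)  # n real + n(n-1)/2 complex conditions
    remaining = real_parameters - unitarity_conditions
    angles = n * (n - 1) // 2               # parameters of a real orthogonal matrix
    phases = remaining - angles
    removable = 2 * n - 1                   # phase differences among 2n quark fields
    return {
        "remaining": remaining,
        "angles": angles,
        "phases": phases,
        "irreducible_phases": phases - removable,
    }


for generations in (2, 3):
    print(generations, mixing_matrix_parameters(generations))
```

For two generations no irreducible phase survives, while for three generations
exactly one does, as argued in the text.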

The 3-generation matrix V, written out in full, is 


$$V = \begin{pmatrix} V_{ud} & V_{us} & V_{ub} \\ V_{cd} & V_{cs} & V_{cb} \\ V_{td} & V_{ts} & V_{tb} \end{pmatrix}, \tag{20.165}$$


and is called the CKM matrix, after Cabibbo, Kobayashi, and Maskawa. 
Clearly, there is no unique parametrization of V. One that has now become 
standard (Nakamura et al. 2010) is (Chau and Keung 1984) 


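$$V = \begin{pmatrix}
c_{12}c_{13} & s_{12}c_{13} & s_{13}e^{-i\delta} \\
-s_{12}c_{23} - c_{12}s_{23}s_{13}e^{i\delta} & c_{12}c_{23} - s_{12}s_{23}s_{13}e^{i\delta} & s_{23}c_{13} \\
s_{12}s_{23} - c_{12}c_{23}s_{13}e^{i\delta} & -c_{12}s_{23} - s_{12}c_{23}s_{13}e^{i\delta} & c_{23}c_{13}
\end{pmatrix} \tag{20.166}$$

where $c_{ij} = \cos\theta_{ij}$, $s_{ij} = \sin\theta_{ij}$ with $i, j = 1, 2, 3$; the $\theta_{ij}$ may be thought of as
the three Euler angles in an orthogonal $V$, and $\delta$ is the remaining irreducible
CP-violating phase. In the limit $\theta_{13} = \theta_{23} = 0$, this CKM matrix reduces to
the Cabibbo-GIM matrix with $\theta_{12} = \theta_C$.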
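
As a quick numerical illustration, the following sketch builds the matrix of
(20.166) and checks that it is unitary and that it reduces to the Cabibbo-GIM
form when $\theta_{13} = \theta_{23} = 0$. The function names `ckm_standard` and
`unitarity_defect`, and the numerical angles used, are assumptions made for this
example only; they are not fitted values.

```python
import cmath
import math


def ckm_standard(t12, t23, t13, delta):
    """CKM matrix in the parametrization (20.166), as nested lists of complex."""
    c12, s12 = math.cos(t12), math.sin(t12)
    c23, s23 = math.cos(t23), math.sin(t23)
    c13, s13 = math.cos(t13), math.sin(t13)
    e = cmath.exp(1j * delta)
    return [
        [complex(c12 * c13), complex(s12 * c13), s13 * e.conjugate()],
        [-s12 * c23 - c12 * s23 * s13 * e, c12 * c23 - s12 * s23 * s13 * e,
         complex(s23 * c13)],
        [s12 * s23 - c12 * c23 * s13 * e, -c12 * s23 - s12 * c23 * s13 * e,
         complex(c23 * c13)],
    ]


def unitarity_defect(V):
    """Largest |(V V^dagger - I)_ij| over all entries."""
    n = len(V)
    worst = 0.0
    for i in range(n):
        for j in range(n):
            entry = sum(V[i][k] * V[j][k].conjugate() for k in range(n))
            worst = max(worst, abs(entry - (1.0 if i == j else 0.0)))
    return worst


# Illustrative (assumed) angles in radians, chosen small and hierarchical.
V = ckm_standard(0.227, 0.041, 0.0035, 1.2)
print("unitarity defect:", unitarity_defect(V))

# Cabibbo-GIM limit: theta_13 = theta_23 = 0.
V2 = ckm_standard(0.227, 0.0, 0.0, 1.2)
print("V_ub, V_cb in the two-generation limit:", abs(V2[0][2]), abs(V2[1][2]))
```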
However, it would also be desirable to have a measure of CP-violation
that was independent of quark rephasing. Consider one of the off-diagonal
unitarity conditions,

$$V_{ud}V_{ub}^* + V_{cd}V_{cb}^* + V_{td}V_{tb}^* = 0. \tag{20.167}$$

(Note that the complex conjugate of this equation gives another, independent,
condition.) The best-measured of these products is $V_{cd}V_{cb}^*$; dividing by this
quantity, (20.167) can be written as

$$1 + z_1 + z_2 = 0, \tag{20.168}$$

where

$$z_1 = \frac{V_{td}V_{tb}^*}{V_{cd}V_{cb}^*}, \qquad z_2 = \frac{V_{ud}V_{ub}^*}{V_{cd}V_{cb}^*}. \tag{20.169}$$


When viewed in the complex plane, relation (20.168) is the statement that the
vectors (1,0), $z_1$ and $z_2$ close to form a triangle as shown in figure 20.10, one
of 6 such unitarity triangles that can be formed. The area $A$ of this triangle
is

$$A = \frac{1}{2}\,{\rm Im}(z_2 z_1^*) = \frac{1}{2}\,{\rm Im}\left(\frac{V_{ud}V_{ub}^*V_{td}^*V_{tb}}{|V_{cd}|^2|V_{cb}|^2}\right). \tag{20.170}$$

Recalling that a rephasing multiplies $V_{ij}$ by $\exp i(\alpha_i - \alpha_j)$, we see that $A$ is
rephasing invariant; in particular, so is the numerator $J$ where

$$J = {\rm Im}(V_{ud}V_{tb}V_{ub}^*V_{td}^*) \tag{20.171}$$


is a Jarlskog invariant (Jarlskog 1985). J may be thought of as follows: (i) 
strike out the ‘c’ row and ‘s’ column of V; (ii) take the complex conjugate of 
the off-diagonal elements in the 2 x 2 matrix that remains; (iii) multiply the 
four elements and take the imaginary part. There is nothing special about 
this particular row and column: there are nine different ways of choosing to 
pair one row with one column, but all such Js are equal up to a sign, because 
of the unitarity of V. In the parametrization (20.166), J takes the form 


$$J = c_{12}s_{12}c_{23}s_{23}c_{13}^2s_{13}\sin\delta, \tag{20.172}$$

which vanishes if any $\theta_{ij} = 0$ or $\pi/2$, or if $\delta = 0$ or $\pi$.
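
The two expressions for $J$, and its invariance under quark rephasing, can be
checked numerically. The sketch below is illustrative only: the helper names
(`ckm_standard`, `jarlskog`, `jarlskog_closed_form`, `rephase`) and the angle and
phase values are assumptions introduced for the example.

```python
import cmath
import math


def ckm_standard(t12, t23, t13, delta):
    """CKM matrix in the parametrization (20.166)."""
    c12, s12 = math.cos(t12), math.sin(t12)
    c23, s23 = math.cos(t23), math.sin(t23)
    c13, s13 = math.cos(t13), math.sin(t13)
    e = cmath.exp(1j * delta)
    return [
        [complex(c12 * c13), complex(s12 * c13), s13 * e.conjugate()],
        [-s12 * c23 - c12 * s23 * s13 * e, c12 * c23 - s12 * s23 * s13 * e,
         complex(s23 * c13)],
        [s12 * s23 - c12 * c23 * s13 * e, -c12 * s23 - s12 * c23 * s13 * e,
         complex(c23 * c13)],
    ]


def jarlskog(V):
    """J = Im(V_ud V_tb V_ub^* V_td^*), as in (20.171)."""
    return (V[0][0] * V[2][2] * V[0][2].conjugate() * V[2][0].conjugate()).imag


def jarlskog_closed_form(t12, t23, t13, delta):
    """J = c12 s12 c23 s23 c13^2 s13 sin(delta), as in (20.172)."""
    return (math.cos(t12) * math.sin(t12) * math.cos(t23) * math.sin(t23)
            * math.cos(t13) ** 2 * math.sin(t13) * math.sin(delta))


def rephase(V, row_phases, col_phases):
    """Multiply V_ij by exp i(alpha_i - alpha_j) for up- and down-type phases."""
    return [[V[i][j] * cmath.exp(1j * (row_phases[i] - col_phases[j]))
             for j in range(3)] for i in range(3)]


angles = (0.227, 0.041, 0.0035, 1.2)  # illustrative values, radians
V = ckm_standard(*angles)
print(jarlskog(V), jarlskog_closed_form(*angles))
print(jarlskog(rephase(V, [0.3, -1.1, 2.0], [0.7, 0.2, -0.5])))
```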

The CKM matrix is an integral part of the Standard Model, and testing 
its validity is an important experimental goal. Various tests are possible. 
Consider first the magnitudes of the CKM elements. These must satisfy six 
relations following from the unitarity of V: namely, the sum of the squares 
of the absolute values of the elements of each row, and of each column, must 
add up to unity. 

The magnitudes of the six elements of the first two rows have been
determined from measurements of semileptonic decay rates: for example, the
amplitude for the tree-level process $d \to u + e^- + \bar\nu_e$ is proportional to $V_{ud}$.
But non-perturbative strong interaction effects enter into the amplitudes for
corresponding measured hadronic transitions, such as $n \to p + e^- + \bar\nu_e$ or
$\pi^- \to \pi^0 + e^- + \bar\nu_e$. In many cases these hadronic factors in the matrix
elements can now be calculated by unquenched lattice QCD.

The status of the experimental determination of the moduli $|V_{ij}|$ is regularly
reviewed by the Particle Data Group. The current results for the unitarity
checks are (Ceccucci et al. 2010)

$$|V_{ud}|^2 + |V_{us}|^2 + |V_{ub}|^2 = 0.9999 \pm 0.0006 \tag{20.173}$$
$$|V_{cd}|^2 + |V_{cs}|^2 + |V_{cb}|^2 = 1.101 \pm 0.074 \tag{20.174}$$
$$|V_{ud}|^2 + |V_{cd}|^2 + |V_{td}|^2 = 1.002 \pm 0.005 \tag{20.175}$$
$$|V_{us}|^2 + |V_{cs}|^2 + |V_{ts}|^2 = 1.098 \pm 0.074. \tag{20.176}$$


Evidently these results are fully consistent with the CKM prediction of uni- 
tarity. 

The most accurate values of the nine magnitudes are obtained by a global
fit to all the available measurements, imposing the constraints of 3-generation
unitarity. The current result for the magnitudes, imposing these constraints,
is (Ceccucci et al. 2010)

$$V = \begin{pmatrix}
0.97428 \pm 0.00015 & 0.2253 \pm 0.0007 & 0.00347^{+0.00016}_{-0.00012} \\
0.2252 \pm 0.0007 & 0.97345^{+0.00015}_{-0.00016} & 0.0410^{+0.0011}_{-0.0007} \\
0.00862^{+0.00026}_{-0.00020} & 0.0403^{+0.0011}_{-0.0007} & 0.999152^{+0.000030}_{-0.000045}
\end{pmatrix}, \tag{20.177}$$

and the Jarlskog invariant is $J = (2.91^{+0.19}_{-0.11}) \times 10^{-5}$.
From (20.177) it follows that the mixing angles are small, and moreover
satisfy a definite hierarchy

$$1 \gg \theta_{12} \gg \theta_{23} \gg \theta_{13}. \tag{20.178}$$


In more physical terms, hadrons evidently prefer to decay semileptonically 
to the nearest generation. Also, because the elements $V_{ub}$, $V_{cb}$, $V_{td}$ and $V_{ts}$,
which connect the third generation to the first two, are all quite small, the 
physics of the first two generations is hardly influenced by the presence of the 
third. This reflects, in quantitative terms, the success of the Cabibbo-GIM 
description, and the fact that the CP-violation seen in the K-meson sector 
is so weak. CP-violation is much more visible in B physics, as Carter and 
Sanda (1980, 1981) were the first to suggest, and as we shall discuss in the 
following chapter. 
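
The hierarchy of the mixing angles can be made explicit with a short
calculation. The sketch below inverts the first row and last column of the
parametrization (20.166), using $|V_{ub}| = s_{13}$, $|V_{us}| = s_{12}c_{13}$ and
$|V_{cb}| = s_{23}c_{13}$, with the central values 0.00347, 0.2253 and 0.0410 quoted in
(20.177). The function name `mixing_angles` is hypothetical, and uncertainties
are ignored.

```python
import math

# Central values of |V_us|, |V_cb|, |V_ub| as quoted in (20.177).
V_US, V_CB, V_UB = 0.2253, 0.0410, 0.00347


def mixing_angles(v_us, v_cb, v_ub):
    """Return (theta_12, theta_23, theta_13) in radians from three magnitudes."""
    s13 = v_ub
    c13 = math.sqrt(1.0 - s13 * s13)
    theta13 = math.asin(s13)
    theta12 = math.asin(v_us / c13)
    theta23 = math.asin(v_cb / c13)
    return theta12, theta23, theta13


t12, t23, t13 = mixing_angles(V_US, V_CB, V_UB)
for name, value in (("theta_12", t12), ("theta_23", t23), ("theta_13", t13)):
    print(f"{name} = {value:.5f} rad = {math.degrees(value):.3f} deg")
```

Each angle is smaller than the previous one by roughly a factor of five to
twelve, which is the content of (20.178).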

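Consider now the complex-valued off-diagonal unitarity conditions, in
particular the condition (20.168). Following Wolfenstein (1983), we identify $s_{12}$ as
the small parameter $\lambda$, and write $V_{cb} \simeq s_{23} = A\lambda^2$ and $V_{ub} = s_{13}\exp(-i\delta) =
A\lambda^3(\rho - i\eta)$ with $A \simeq 1$ and $|\rho - i\eta| < 1$. This gives

$$V \simeq \begin{pmatrix}
1 - \lambda^2/2 & \lambda & A\lambda^3(\rho - i\eta) \\
-\lambda & 1 - \lambda^2/2 & A\lambda^2 \\
A\lambda^3(1 - \rho - i\eta) & -A\lambda^2 & 1
\end{pmatrix} \tag{20.179}$$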
FIGURE 20.10
The unitarity triangle represented by (20.168). [Vertex labels in the figure:
(0, 0) and $(\bar\rho, \bar\eta)$.]


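neglecting terms of order $\lambda^4$ and higher. Then

$$z_1 = \frac{V_{td}V_{tb}^*}{V_{cd}V_{cb}^*} = \rho + i\eta - 1, \qquad z_2 = \frac{V_{ud}V_{ub}^*}{V_{cd}V_{cb}^*} = -(\rho + i\eta), \qquad J = A^2\lambda^6\eta. \tag{20.180}$$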
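
These leading-order statements can be checked against the truncated matrix
(20.179). The sketch below is illustrative: the function name `wolfenstein` and
the parameter values are assumptions chosen only to be of a realistic size, and
the comparisons hold up to relative corrections of order $\lambda^2$, as expected
from the truncation.

```python
def wolfenstein(lam, A, rho, eta):
    """Truncated Wolfenstein matrix (20.179) as nested lists of complex numbers."""
    z = complex(rho, eta)
    return [
        [complex(1 - lam**2 / 2), complex(lam), A * lam**3 * z.conjugate()],
        [complex(-lam), complex(1 - lam**2 / 2), complex(A * lam**2)],
        [A * lam**3 * (1 - z), complex(-A * lam**2), complex(1.0)],
    ]


# Assumed illustrative parameter values.
lam, A, rho, eta = 0.2253, 0.81, 0.14, 0.35
V = wolfenstein(lam, A, rho, eta)

norm = V[1][0] * V[1][2].conjugate()            # V_cd V_cb^*
z1 = V[2][0] * V[2][2].conjugate() / norm       # V_td V_tb^* / (V_cd V_cb^*)
z2 = V[0][0] * V[0][2].conjugate() / norm       # V_ud V_ub^* / (V_cd V_cb^*)
J = (V[0][0] * V[2][2] * V[0][2].conjugate() * V[2][0].conjugate()).imag

print("z1 =", z1, " expected", complex(rho, eta) - 1)
print("z2 =", z2, " expected", -complex(rho, eta))
print("|1 + z1 + z2| =", abs(1 + z1 + z2), " (of order lambda^2)")
print("J / (A^2 lambda^6 eta) =", J / (A**2 * lam**6 * eta))
```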


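The unitarity triangle represented by the condition (20.168), or alternatively
$-z_1 - z_2 = 1$, is therefore a triangle on the base (1,0), with sides $\rho + i\eta$ and
$1 - (\rho + i\eta)$. Buras et al. (1994) showed that including terms up to order $\lambda^5$
changes $(\rho, \eta)$ to $(\bar\rho, \bar\eta)$ where $\bar\rho = (1 - \lambda^2/2)\rho$, $\bar\eta = (1 - \lambda^2/2)\eta$. The top vertex
of the triangle in figure 20.10 is therefore at the point $(\bar\rho, \bar\eta)$. The angles $\alpha$, $\beta$
and $\gamma$ (also called $\phi_2$, $\phi_1$ and $\phi_3$) are defined by

$$\alpha \equiv \phi_2 \equiv \arg\left(-\frac{V_{td}V_{tb}^*}{V_{ud}V_{ub}^*}\right) \approx \arg\left(-\frac{1 - \bar\rho - i\bar\eta}{\bar\rho + i\bar\eta}\right) \tag{20.181}$$
$$\beta \equiv \phi_1 \equiv \arg\left(-\frac{V_{cd}V_{cb}^*}{V_{td}V_{tb}^*}\right) \approx \arg\left(\frac{1}{1 - \bar\rho - i\bar\eta}\right) \tag{20.182}$$
$$\gamma \equiv \phi_3 \equiv \arg\left(-\frac{V_{ud}V_{ub}^*}{V_{cd}V_{cb}^*}\right) \approx \arg(\bar\rho + i\bar\eta). \tag{20.183}$$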
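
As an illustration of these definitions, the sketch below evaluates the three
angles from the approximate forms on the right-hand sides of (20.181)-(20.183)
and confirms that they sum to $\pi$, as the angles of a triangle must. The
function name `triangle_angles` and the apex coordinates are assumptions for the
example, not fitted values.

```python
import cmath
import math


def triangle_angles(rho_bar, eta_bar):
    """Return (alpha, beta, gamma) in radians for apex (rho_bar, eta_bar)."""
    z = complex(rho_bar, eta_bar)
    alpha = cmath.phase(-(1 - z) / z)
    beta = cmath.phase(1 / (1 - z))
    gamma = cmath.phase(z)
    return alpha, beta, gamma


# Assumed illustrative apex of the unitarity triangle.
alpha, beta, gamma = triangle_angles(0.13, 0.34)
for name, value in (("alpha", alpha), ("beta", beta), ("gamma", gamma)):
    print(f"{name} = {math.degrees(value):.1f} deg")
print("sum =", math.degrees(alpha + beta + gamma), "deg")
```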


The sides of this triangle are determined by the magnitudes of the CKM
elements, and so another check is provided by the condition that the three
sides should close to form a triangle. Further independent constraints are
provided by measurements of the angles $\alpha$, $\beta$, and $\gamma$ which are directly related
to CP-violation effects, as we shall discuss in the following chapter. Figure
20.11 shows a plot of all the constraints in the $\bar\rho$, $\bar\eta$ plane from many different
measurements (combined following the approach of Charles et al. 2005 and
Hocker et al. 2001), and the global fit, as presented by Ceccucci et al. (2010).
The annular region labelled by $|V_{ub}|$ represents, for example, the uncertainty
in the determination of $|z_2| = |V_{ud}V_{ub}^*/V_{cd}V_{cb}^*|$, which is principally due to the
uncertainty in $|V_{ub}|$. The region labelled by $\Delta m_d$ represents the constraint on
FIGURE 20.11

Constraints in the $\bar\rho$, $\bar\eta$ plane. The shaded areas have 95% CL. [Figure
reproduced, courtesy Michael Barnett for the Particle Data Group, from the review
of the CKM Quark-Mixing Matrix by A Ceccucci, Z Ligeti and Y Sakai,
section 11 in the Review of Particle Physics, K Nakamura et al. (Particle Data
Group) Journal of Physics G 37 (2010) 075021, IOP Publishing Limited.]
(See color plate III.)


$|z_1| = |V_{td}V_{tb}^*/V_{cd}V_{cb}^*|$, where $|V_{td}|$ is deduced from the value of the $B^0$-$\bar B^0$
mass difference $\Delta m_d$ measured in $B^0$-$\bar B^0$ oscillations mediated by top-quark
dominated box diagrams (see section 21.2.1 in the following chapter); here
the uncertainties are dominated by lattice QCD. Figure 20.11 represents an
enormous experimental effort, especially in the decade 2000-2010. The 95%
CL regions all overlap consistently. It is quite remarkable how the single
CP-violating parameter, three-generation scheme of Kobayashi and Maskawa
(1973) has withstood this searching test.
FIGURE 20.12
Effective four-fermion non-leptonic weak transition at the quark level.


FIGURE 20.13
Non-leptonic weak decay of $\Lambda^0$ using the process of figure 20.12, with the
addition of two ‘spectator’ quarks.


20.8  Non-leptonic weak interactions 


The CKM 6-quark charged weak current, which replaces the GIM current
(20.101), is

$$\hat j^\mu_{\rm CKM}(u,d,s,c,t,b) = \hat{\bar u}\gamma^\mu\frac{(1-\gamma_5)}{2}\hat d' + \hat{\bar c}\gamma^\mu\frac{(1-\gamma_5)}{2}\hat s' + \hat{\bar t}\gamma^\mu\frac{(1-\gamma_5)}{2}\hat b', \tag{20.184}$$

and the effective weak Hamiltonian of (20.92) (as modified by CKM) clearly
contains the term

$$\hat{\mathcal H}^{q}_{\rm CC}(x) = \frac{4G_F}{\sqrt{2}}\,\hat j^\mu_{\rm CKM}(x)\,\hat j^\dagger_{{\rm CKM}\,\mu}(x) \tag{20.185}$$

in which no lepton fields are present (just as there are no quarks in (20.40)).
This interaction is responsible, at the quark level, for transitions involving
four quark (or antiquark) fields at a point. For example, the process shown in
figure 20.12 can occur. By ‘adding on’ another two quark lines $u$ and $d$, which
undergo no weak interaction, we arrive at figure 20.13, which represents the
non-leptonic decay $\Lambda^0 \to p\pi^-$.
This figure is, of course, rather symbolic since there are strong QCD
interactions (not shown) which are responsible for binding the three-quark
systems into baryons, and the $q\bar q$ system into a meson. Unlike the case of
deep inelastic lepton scattering, these QCD interactions cannot be treated
perturbatively, since the distance scales involved are typically those of the
hadron sizes ($\sim 1$ fm), where perturbation theory fails. This means that
non-leptonic weak interactions among hadrons are difficult to analyze
quantitatively, though progress can be made via lattice QCD. Similar difficulties
also arise, evidently, in the case of semi-leptonic decays. In general, one has
to begin in a phenomenological way, parametrizing the decay amplitudes in
terms of appropriate form factors (which are analogous to the electromagnetic
form factors introduced in chapter 8). In the case of transitions involving at
least one heavy quark $Q$, Isgur and Wise (1989, 1990) noticed that a
considerable simplification occurs in the limit $m_Q \to \infty$. For example, one universal
function (the ‘Isgur-Wise form factor’) is sufficient to describe a large number
of hadronic form factors introduced for semi-leptonic transitions between two
heavy pseudoscalar ($0^-$) or vector ($1^-$) mesons. For an introduction to the
Isgur-Wise theory we refer to Donoghue et al. (1992).

The non-leptonic sector is, however, the scene of some very interesting
physics, such as $K^0$-$\bar K^0$ and $B^0$-$\bar B^0$ oscillations, and CP violation in the
$K^0$-$\bar K^0$, $D^0$-$\bar D^0$ and $B^0$-$\bar B^0$ systems. We shall discuss these phenomena in
the following chapter.




Problems 


20.1 Show that in the non-relativistic limit ($|\boldsymbol{p}| \ll M$) the matrix element
$\bar u_p\gamma^\mu u_n$ of (20.2) vanishes if p and n have different spin states.


20.2 Verify the normalization $N = (E + |\boldsymbol{p}|)^{1/2}$ in (20.23).
20.3 Verify (20.30) and (20.31). 
20.4 Verify that equations (20.32) are invariant under CP. 


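20.5 The matrix $\gamma_5$ is defined by $\gamma_5 = i\gamma^0\gamma^1\gamma^2\gamma^3$. Prove the following
properties:

(a) $\gamma_5^2 = 1$ and hence that
$$(1 + \gamma_5)(1 - \gamma_5) = 0;$$

(b) from the anticommutation relations of the other $\gamma$ matrices, show that
$$\{\gamma_5, \gamma_\mu\} = 0$$
and hence that
$$(1 + \gamma_5)\gamma_0 = \gamma_0(1 - \gamma_5)$$
and
$$(1 + \gamma_5)\gamma_0\gamma_\mu = \gamma_0\gamma_\mu(1 + \gamma_5).$$


20.6

(a) Consider the two-dimensional antisymmetric tensor $\epsilon_{ij}$ defined by
$$\epsilon_{12} = +1, \quad \epsilon_{21} = -1, \quad \epsilon_{11} = \epsilon_{22} = 0.$$
By explicitly enumerating all the possibilities (if necessary), convince
yourself of the result
$$\epsilon_{ij}\epsilon_{kl} = +1(\delta_{ik}\delta_{jl} - \delta_{il}\delta_{jk}).$$
Hence prove that
$$\epsilon_{ij}\epsilon_{il} = \delta_{jl} \quad \text{and} \quad \epsilon_{ij}\epsilon_{ij} = 2$$
(remember, in two dimensions, $\sum_i \delta_{ii} = 2$).

(b) By similar reasoning to that in part (a) of this question, it can be shown
that the product of two three-dimensional antisymmetric tensors has
the form
$$\epsilon_{ijk}\epsilon_{lmn} = \begin{vmatrix} \delta_{il} & \delta_{im} & \delta_{in} \\ \delta_{jl} & \delta_{jm} & \delta_{jn} \\ \delta_{kl} & \delta_{km} & \delta_{kn} \end{vmatrix}.$$
Prove the results
$$\epsilon_{ijk}\epsilon_{imn} = \begin{vmatrix} \delta_{jm} & \delta_{jn} \\ \delta_{km} & \delta_{kn} \end{vmatrix}, \qquad \epsilon_{ijk}\epsilon_{ijn} = 2\delta_{kn}, \qquad \epsilon_{ijk}\epsilon_{ijk} = 3!$$

(c) Extend these results to the case of the four-dimensional (Lorentz)
tensor $\epsilon_{\mu\nu\alpha\beta}$ (remember that a minus sign will appear as a result of
$\epsilon_{0123} = +1$ but $\epsilon^{0123} = -1$).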


20.7 Starting from the amplitude for the process

$$\nu_\mu + e^- \to \mu^- + \nu_e$$

given by the current-current theory of weak interactions,

$$\mathcal{M} = -i(G_F/\sqrt{2})\,\bar u(\mu)\gamma_\mu(1 - \gamma_5)u(\nu_\mu)\,g^{\mu\nu}\,\bar u(\nu_e)\gamma_\nu(1 - \gamma_5)u(e),$$

verify the intermediate results given in section 20.5 leading to the result

$$d\sigma/dt = G_F^2/\pi$$

(neglecting all lepton masses). Hence show that the local total cross section
for this process rises linearly with $s$:

$$\sigma = G_F^2 s/\pi.$$


20.8 The invariant amplitude for $\pi^+ \to e^+\nu$ decay may be written as (see
(18.52))

$$\mathcal{M} = (G_F V_{ud}) f_\pi p^\mu\,\bar u(\nu)\gamma_\mu(1 - \gamma_5)v(e)$$

where $p^\mu$ is the 4-momentum of the pion, and the neutrino is taken to be
massless. Evaluate the decay rate in the rest frame of the pion using the
decay rate formula

$$\Gamma = (1/2m_\pi)|\mathcal{M}|^2\,{\rm dLips}(m_\pi^2; k_e, k_\nu).$$

Show that the ratio of $\pi^+ \to e^+\nu$ and $\pi^+ \to \mu^+\nu$ rates is given by

$$\frac{\Gamma(\pi^+ \to e^+\nu)}{\Gamma(\pi^+ \to \mu^+\nu)} = \left(\frac{m_e}{m_\mu}\right)^2\left(\frac{m_\pi^2 - m_e^2}{m_\pi^2 - m_\mu^2}\right)^2.$$

Repeat the calculation using the amplitude

$$\mathcal{M}' = (G_F V_{ud}) f_\pi p^\mu\,\bar u(\nu)\gamma_\mu(g_V + g_A\gamma_5)v(e)$$

and retaining a finite neutrino mass. Discuss the $e^+/\mu^+$ ratio in the light of
your result.


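20.9

(a) Verify that the inclusive inelastic neutrino-proton scattering differential
cross section has the form
$$\frac{d^2\sigma^{(\nu)}}{dQ^2 d\nu} = \frac{G_F^2}{2\pi}\frac{k'}{k}\left(W_2^{(\nu)}\cos^2(\theta/2) + W_1^{(\nu)}\,2\sin^2(\theta/2) + \frac{k + k'}{M}\sin^2(\theta/2)\,W_3^{(\nu)}\right)$$
in the notation of section 20.7.2.

(b) Using the Bjorken scaling behaviour
$$\nu W_2^{(\nu)} \to F_2^{(\nu)}, \qquad M W_1^{(\nu)} \to F_1^{(\nu)}, \qquad \nu W_3^{(\nu)} \to F_3^{(\nu)}$$
rewrite this expression in terms of the scaling functions. In terms of
the variables $x$ and $y$, neglect all masses and show that
$$\frac{d^2\sigma^{(\nu)}}{dx\,dy} = \frac{G_F^2}{2\pi}s\left[F_2^{(\nu)}(1 - y) + F_1^{(\nu)}xy^2 + F_3^{(\nu)}(1 - y/2)yx\right].$$
Remember that
$$\frac{k'\sin^2(\theta/2)}{M} = \frac{xy}{2}.$$

(c) Insert the Callan-Gross relation
$$2xF_1^{(\nu)} = F_2^{(\nu)}$$
to derive the result quoted in section 20.7.2:
$$\frac{d^2\sigma^{(\nu)}}{dx\,dy} = \frac{G_F^2}{2\pi}sF_2^{(\nu)}\left(\frac{1 + (1 - y)^2}{2} + \frac{xF_3^{(\nu)}}{F_2^{(\nu)}}\,\frac{1 - (1 - y)^2}{2}\right).$$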


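20.10 The differential cross section for $\nu_\mu q$ scattering by charged currents
has the same form (neglecting masses) as the $\nu_\mu e^- \to \mu^-\nu_e$ result of problem
20.7, namely
$$\frac{d\sigma}{dt}(\nu q) = \frac{G_F^2}{\pi}.$$

(a) Show that the cross section for scattering by antiquarks $\nu_\mu\bar q$ has the
form
$$\frac{d\sigma}{dt}(\nu\bar q) = \frac{G_F^2}{\pi}(1 - y)^2.$$

(b) Hence prove the results quoted in section 20.7.2:
$$\frac{d^2\sigma}{dx\,dy}(\nu q) = \frac{G_F^2}{\pi}sx\,\delta(x - Q^2/2M\nu)$$
and
$$\frac{d^2\sigma}{dx\,dy}(\nu\bar q) = \frac{G_F^2}{\pi}sx(1 - y)^2\,\delta(x - Q^2/2M\nu)$$
(where $M$ is the nucleon mass).

(c) Use the parton model prediction
$$\frac{d^2\sigma^{(\nu)}}{dx\,dy} = \frac{G_F^2}{\pi}sx\left[q(x) + \bar q(x)(1 - y)^2\right]$$
to show that
$$F_2^{(\nu)} = 2x[q(x) + \bar q(x)]$$
and
$$\frac{xF_3^{(\nu)}(x)}{F_2^{(\nu)}(x)} = \frac{q(x) - \bar q(x)}{q(x) + \bar q(x)}.$$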


20.11 Verify the transformation laws (20.160). 
CP Violation and Oscillation Phenomena


In this chapter we shall continue with the phenomenology of weak interactions,
introducing two topics which have been the focus of intense experimental effort
in the recent decade: CP violation in B meson decays, and oscillations in both
neutral meson and neutrino systems. In the following chapter we take up again
the gauge theory theme, with the Glashow-Salam-Weinberg electroweak theory.

CP violation was first discovered in the decays of neutral K mesons (Chris- 
tenson et al. 1964), but we shall not follow a historical approach to this sub- 
ject. Instead we shall concentrate on B-meson decays, where the effects are 
far larger, and much clearer to interpret theoretically than in the K-meson 
case. CP violation is reviewed in Branco et al. (1999), Bigi and Sanda (2000) 
and Harrison and Quinn (1998). We aim simply to illustrate the principles 
with some particular examples. In particular, we shall generally not discuss 
theoretical predictions; our main emphasis will be on describing selected
experiments which have allowed determinations of the angles $\alpha$, $\beta$ and $\gamma$ of the
unitarity triangle, figure 20.10.

We saw in section 20.7.3 that, in the Standard Model, CP violation is
attributable solely to one irreducible phase degree of freedom, $\delta$, in the CKM
matrix $V$. Clearly, to measure this phase, it is necessary (as usual in quantum
mechanics) to create situations where it enters into the interference between
two complex amplitudes. Two situations may be distinguished (Carter and
Sanda 1980):


(i) interference between two decay amplitudes $B^0 \to X$ and $\bar B^0 \to X$,
where the $B^0$ and $\bar B^0$ have been produced in a coherent state by mixing,
and decay to a common hadronic final state $X$;


(ii) interference between two different amplitudes for a single B-meson to 
decay to a final state X. 


Method (ii) (‘direct CP violation’) can be applied to charged as well as neutral 
mesons. 

The mixing in method (i) is formally similar to that involved in neutrino 
oscillations, which we treat after the meson case. We shall therefore start in 
section 21.1 with an example illustrating method (ii). We set up the mixing 
formalism and apply it to CP violation in B decays in section 21.2; we discuss 
K decays in section 21.3. Neutrino oscillations are treated in section 21.4. 
FIGURE 21.1
Tree diagram contribution to $B^0 \to K^+\pi^-$ via the quark transition $\bar b \to \bar s u\bar u$.


FIGURE 21.2
Penguin diagrams ($f = \bar u, \bar c, \bar t$) contributing to $B^0 \to K^+\pi^-$ via the quark
transition $\bar b \to \bar s u\bar u$.



21.1 Direct CP violation in B decays 


Consider the decays

$$B^0 \to K^+\pi^- \quad \text{and} \quad \bar B^0 \to K^-\pi^+. \tag{21.1}$$

The first of these can proceed via the quark transitions shown in figure 21.1,
which (in parton-like language) is a ‘tree-diagram’. Of course, long-distance
strong interaction effects will come into play in forming the hadronic states
$B^0$, $K^+$ and $\pi^-$, and in final state interactions between the $K^+$ and $\pi^-$; we
do not represent these strong interactions in figure 21.1, or in subsequent
similar diagrams. We are specifically interested in the weak phase of figure
21.1, since it is this quantity which changes sign under the CP transformation
($V_{ij} \to V_{ij}^*$), and this phase change will lead to observable CP violation effects.
By contrast, the strong interaction phases, which will play an important role,
will be CP invariant, but we do not need to display them yet. So we write
the amplitude for figure 21.1 as

$$A_T(B^0 \to K^+\pi^-) = V_{ub}^*V_{us}\,t_{\bar u}, \tag{21.2}$$

where the CKM couplings have been displayed.
There are, however, three order-$\alpha_s$ loop corrections to figure 21.1, shown
in figure 21.2, where $f = \bar u, \bar c$ and $\bar t$. We write the amplitude for the sum of
these three ‘penguin’ diagrams as

$$A_P(B^0 \to K^+\pi^-) = V_{us}V_{ub}^*\,p_{\bar u} + V_{cs}V_{cb}^*\,p_{\bar c} + V_{ts}V_{tb}^*\,p_{\bar t}, \tag{21.3}$$

where $p_{\bar f}$ is the penguin amplitude with $f$ in the loop. It is convenient to use
a unitarity relation to rewrite $V_{ts}V_{tb}^*$ in terms of the other two related CKM
products:

$$V_{ts}V_{tb}^* = -V_{us}V_{ub}^* - V_{cs}V_{cb}^*, \tag{21.4}$$

so that the total amplitude becomes

$$A(B^0 \to K^+\pi^-) = V_{ub}^*V_{us}\,T_{K\pi} + V_{cs}V_{cb}^*\,P_{K\pi}, \tag{21.5}$$

where

$$T_{K\pi} = t_{\bar u} + p_{\bar u} - p_{\bar t}, \qquad P_{K\pi} = p_{\bar c} - p_{\bar t}. \tag{21.6}$$

In terms of the parametrization (20.179), (21.5) becomes

$$A(B^0 \to K^+\pi^-) = A\lambda^4(\rho + i\eta)T_{K\pi} + A\lambda^2(1 - \lambda^2/2)P_{K\pi}. \tag{21.7}$$

Similarly, the amplitude for the charge-conjugate reaction is

$$A(\bar B^0 \to K^-\pi^+) = A\lambda^4(\rho - i\eta)T_{K\pi} + A\lambda^2(1 - \lambda^2/2)P_{K\pi}. \tag{21.8}$$

We can now calculate the decay-rate asymmetry

$$\mathcal{A}_{K\pi} = \frac{|A(\bar B^0 \to K^-\pi^+)|^2 - |A(B^0 \to K^+\pi^-)|^2}{|A(\bar B^0 \to K^-\pi^+)|^2 + |A(B^0 \to K^+\pi^-)|^2}. \tag{21.9}$$

To simplify things, let us take a common complex factor $K$ out of the
expressions (21.7) and (21.8) and write them as

$$A(B^0 \to K^+\pi^-) = K\left(e^{i\gamma} + Re^{i(\delta_P - \delta_T)}\right) \tag{21.10}$$
$$A(\bar B^0 \to K^-\pi^+) = K\left(e^{-i\gamma} + Re^{i(\delta_P - \delta_T)}\right), \tag{21.11}$$

where (see equation (20.183)) $\gamma$ is the phase of $\rho + i\eta$, $R$ is real, and $\delta_P - \delta_T$ is
the difference in (strong) phases between $P_{K\pi}$ and $T_{K\pi}$. Then we easily find

$$\mathcal{A}_{K\pi} = \frac{2R\sin\gamma\sin(\delta_T - \delta_P)}{1 + R^2 + 2R\cos\gamma\cos(\delta_T - \delta_P)}. \tag{21.12}$$


Thus we see that, for a CP-violating signal, there must be two interfering 
amplitudes leading to a common final state, and the amplitudes must have 
both different weak phases and different strong phases. An order of magnitude 
estimate of the effect can be made as follows. First, note that $P_{K\pi}$ is not
ultraviolet divergent, since it is the difference of two penguin contributions;
its magnitude is expected to be of order $\alpha_s/\pi \sim 0.05$. The tree contribution
in (21.7) carries an extra factor of $\lambda^2 \sim 0.05$ as compared with the penguin
contribution, so that $R$ is of order 1. This indicates that the asymmetry
should be significant.
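
The step from (21.10) and (21.11) to (21.12) can be checked numerically. In the
sketch below the function names and the input values of $\gamma$, $R$ and the strong
phase difference are assumptions chosen only for illustration; the common factor
$K$ cancels in the ratio and is set to 1.

```python
import cmath
import math


def asymmetry_from_amplitudes(gamma, R, delta_p_minus_t):
    """Evaluate (21.9) directly from the amplitudes (21.10) and (21.11), K = 1."""
    strong = R * cmath.exp(1j * delta_p_minus_t)
    a_b0 = cmath.exp(1j * gamma) + strong        # B0 -> K+ pi-
    a_b0bar = cmath.exp(-1j * gamma) + strong    # B0bar -> K- pi+
    num = abs(a_b0bar) ** 2 - abs(a_b0) ** 2
    den = abs(a_b0bar) ** 2 + abs(a_b0) ** 2
    return num / den


def asymmetry_formula(gamma, R, delta_t_minus_p):
    """Closed form (21.12)."""
    return (2 * R * math.sin(gamma) * math.sin(delta_t_minus_p)
            / (1 + R * R + 2 * R * math.cos(gamma) * math.cos(delta_t_minus_p)))


gamma, R, d_pt = 1.2, 1.0, 0.3   # assumed illustrative inputs (radians)
print(asymmetry_from_amplitudes(gamma, R, d_pt), asymmetry_formula(gamma, R, -d_pt))

# The asymmetry vanishes without a strong phase difference or without a weak phase.
print(asymmetry_from_amplitudes(gamma, R, 0.0), asymmetry_from_amplitudes(0.0, R, d_pt))
```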
FIGURE 21.3

Left-hand part: tree diagram contributions to $B^- \to D^0K^-$ (upper diagram,
via quark transition $b \to c\bar us$), and to $B^- \to \bar D^0K^-$ (lower diagram, via quark
transition $b \to u\bar cs$). Right-hand part: decays of $D^0$ and $\bar D^0$ to the common
$K_S\pi^+\pi^-$ state.

Indeed non-zero values of $\mathcal{A}_{K\pi}$ have been reported by both the BaBar and
Belle collaborations:


BaBar (Aubert et al. 2004): $\mathcal{A}_{K\pi} = -0.133 \pm 0.030 \pm 0.009$ (21.13)
Belle (Chao et al. 2005): $\mathcal{A}_{K\pi} = -0.113 \pm 0.022 \pm 0.008$ (21.14)


where the first error is statistical and the second is systematic.

Although $\mathcal{A}_{K\pi}$ is sensitive to the CP-violating angle $\gamma$, it is not easy to
extract $\gamma$ cleanly from these measurements. Both the tree and the penguin
amplitudes involve non-perturbative factors for producing a particular meson
state from the corresponding $q\bar q$ state; the strong phases are also not
calculable.

A decay with no penguin contributions, but still with two interfering
channels, would have fewer uncertainties. (It is also less likely to be affected by new
physics, which could provide short-distance corrections to penguin loops.) One
such example is provided by the decays (i) $B^- \to D^0K^-$ and (ii) $B^- \to \bar D^0K^-$,
which can interfere when the ($D^0K^-$) and ($\bar D^0K^-$) states decay to a common
final state. Here the quark transition in (i) is $b \to c\bar us$, and in (ii) is $b \to u\bar cs$;
in neither case is a penguin contribution possible.

The tree-level diagrams which contribute are shown in the left-hand parts
of figure 21.3 (we shall discuss the right-hand parts in a moment). We denote
the amplitude for $B^- \to D^0K^-$ by $A_B$, and note that $A_B \sim A\lambda^3$. The
amplitude for $B^- \to \bar D^0K^-$, $\bar A_B$, differs in three ways from $A_B$: (i) it is
colour-suppressed by a factor 1/3 since the $\bar c$ and $u$ have to be colour matched; (ii)
it contains the factor $V_{ub}V_{cs}^* \sim A\lambda^3(\rho - i\eta)$; (iii) it will have a different strong
interaction phase. With these factors in mind, we write

$$\bar A_B = r_B A_B e^{i(\delta_B - \gamma)} \tag{21.15}$$

where $\delta_B$ is the difference in strong phases between $\bar A_B$ and $A_B$, and $r_B$ is
the magnitude ratio of the amplitudes. Since $|\rho - i\eta| \sim 0.38$, $r_B$ is of order
0.1-0.2, allowing for the colour suppression.

Once again, the asymmetry is proportional to

$$|1 + r_B\exp[i(\delta_B - \gamma)]|^2 - |1 + r_B\exp[i(\delta_B + \gamma)]|^2 = 4r_B\sin\delta_B\sin\gamma. \tag{21.16}$$


This involves $\gamma$, but the relative smallness of $r_B$ tends to reduce the sensitivity
to $\gamma$. An alternative determination of $\gamma$ can be made (Attwood et al. 2001,
Giri et al. 2003) by making use of three-body decays (to a common channel)
of $D^0$ and $\bar D^0$, such as $D^0, \bar D^0 \to K_S\pi^+\pi^-$. If we denote the amplitude for
$D^0 \to K_S\pi^+\pi^-$ by $A(s_-, s_+)$ (see figure 21.3), where $s_- = (p_K + p_{\pi^-})^2$ and
$s_+ = (p_K + p_{\pi^+})^2$ are the indicated invariant squared masses, then the amplitude for
the $B^-$ to decay to $K^-K_S\pi^+\pi^-$ via the $D^0$ and $\bar D^0$ paths is$^1$

$$A_- = A_B D\left[A(s_-, s_+) + r_B e^{i(\delta_B - \gamma)}A(s_+, s_-)\right], \tag{21.17}$$

and the amplitude for the charge conjugate reaction $B^+ \to K^+K_S\pi^-\pi^+$ is

$$A_+ = A_B D\left[A(s_+, s_-) + r_B e^{i(\delta_B + \gamma)}A(s_-, s_+)\right], \tag{21.18}$$

where $D$ is the $D$ meson propagator. The event rate for the $B^-$ decay is then
$\Gamma_-(s_-, s_+)$ where (Aubert et al. 2008)

$$\Gamma_-(s_-, s_+) \propto |A(s_-, s_+)|^2 + r_B^2|A(s_+, s_-)|^2 + 2\left[x_-{\rm Re}\{A(s_-, s_+)A^*(s_+, s_-)\} + y_-{\rm Im}\{A(s_-, s_+)A^*(s_+, s_-)\}\right] \tag{21.19}$$

and the rate for $B^+$ decay is $\Gamma_+(s_-, s_+)$ where

$$\Gamma_+(s_-, s_+) \propto |A(s_+, s_-)|^2 + r_B^2|A(s_-, s_+)|^2 + 2\left[x_+{\rm Re}\{A(s_+, s_-)A^*(s_-, s_+)\} + y_+{\rm Im}\{A(s_+, s_-)A^*(s_-, s_+)\}\right]. \tag{21.20}$$

Here

$$x_- = r_B\cos(\delta_B - \gamma), \qquad y_- = r_B\sin(\delta_B - \gamma) \tag{21.21}$$
$$x_+ = r_B\cos(\delta_B + \gamma), \qquad y_+ = r_B\sin(\delta_B + \gamma). \tag{21.22}$$


The geometry of the CP-violating parameters is shown in figure 21.4. Note
that the separation of the $B^-$ and $B^+$ positions in the $(x, y)$ plane is equal to
$2r_B|\sin\gamma|$, and is a measure of direct CP violation. The angle between the
lines connecting the $B^-$ and $B^+$ centres to the origin (0,0) is equal to $2\gamma$.


$^1$ We are neglecting $D^0$-$\bar D^0$ mixing and CP asymmetries in $D$ decays, which are at the
1% or less level (Grossman et al. 2005).


FIGURE 21.4
Geometry of the CP-violating parameters $x_\pm$, $y_\pm$. [Labels in the figure: the
points $(x_-, y_-)$ and $(x_+, y_+)$, each at distance $r_B$ from the origin.]
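
The two geometrical statements follow directly from (21.21) and (21.22), and a
short numerical sketch makes them concrete. The function name
`cartesian_parameters` and the values of $r_B$, $\delta_B$ and $\gamma$ below are assumptions
used only for illustration.

```python
import cmath
import math


def cartesian_parameters(r_b, delta_b, gamma):
    """Return ((x_-, y_-), (x_+, y_+)) as defined in (21.21) and (21.22)."""
    minus = (r_b * math.cos(delta_b - gamma), r_b * math.sin(delta_b - gamma))
    plus = (r_b * math.cos(delta_b + gamma), r_b * math.sin(delta_b + gamma))
    return minus, plus


r_b, delta_b, gamma = 0.1, 2.0, 1.2   # assumed illustrative inputs (radians)
(xm, ym), (xp, yp) = cartesian_parameters(r_b, delta_b, gamma)

separation = math.hypot(xp - xm, yp - ym)
opening_angle = cmath.phase(complex(xp, yp) / complex(xm, ym))

print("separation   =", separation, " vs 2 r_B |sin gamma| =", 2 * r_b * abs(math.sin(gamma)))
print("opening angle =", opening_angle, " vs 2 gamma =", 2 * gamma)
```

Note that `cmath.phase` returns a value in $(-\pi, \pi]$, so the comparison with
$2\gamma$ holds as written only when $2\gamma$ lies in that range.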

If the functional dependence of both the modulus and the phase of $A(s_-, s_+)$
were known, then the rates would depend on only three variables, $r_B$, $\delta_B$, and
$\gamma$ (or equivalently on $x_\pm$, $y_\pm$). In fact, $A(s_-, s_+)$ can be determined from a
Dalitz plot analysis of the decays of $D^0$ mesons coming from $D^{*+} \to D^0\pi^+$
decays produced in $e^+e^- \to c\bar c$ events; the charge of the low-momentum $\pi^+$
identifies the flavour of the $D^0$. Such an analysis is a well-established tool
in the study of three-hadron final states, originating in the pioneer work of
Dalitz (1953), in connection with the decay $K \to 3\pi$. The partial rate for
$D^0(\bar D^0) \to K_S\pi^+\pi^-$ is (see the kinematics section of Nakamura et al. 2010)

$$d\Gamma \propto |A(s_-, s_+)|^2\,ds_-\,ds_+. \tag{21.23}$$

The physical region in the $s_-$, $s_+$ plane is a bounded oval-like region, which
would be uniformly populated if $A(s_-, s_+)$ were a constant. In reality, the
decay is dominated by quasi two-body states, in particular

$$\begin{aligned}
D^0 &\to K^{*-}(s_-)\pi^+ && \text{(CA)} \\
&\to K^{*+}(s_+)\pi^- && \text{(DCS)} \\
&\to K_S\rho^0(s_0), && \text{(CP)}
\end{aligned} \tag{21.24}$$

where (CA) means CKM-favoured, (DCS) means doubly CKM-suppressed,
and (CP) means that it is a CP eigenstate. The Dalitz plot shows a dense
band of events at $s_- = m^2_{K^{*-}}$ corresponding to the $K^{*-}$ resonance, a band
at $s_+ = m^2_{K^{*+}}$, and a band at $s_0 = m_\rho^2$, where $s_0 = (p_{\pi^+} + p_{\pi^-})^2$ and
$s_+ + s_- + s_0 = m_D^2 + m_K^2 + 2m_\pi^2$.

The Dalitz-plot analysis proceeds by writing (Aubert et al. 2008) $A(s_-, s_+)$
as a coherent sum of terms representing the quasi two-body modes, together
with a non-resonant background. Once $A(s_-, s_+)$ is known, it is inserted into
$\Gamma_\pm(s_-, s_+)$ to obtain $(x_\pm, y_\pm)$ from the Dalitz plot distributions of the signal
modes of the $B^\pm$ decays. From these, the quantities $r_B$, $\delta_B$ and finally $\gamma$ can
be inferred.
This method has been applied by both BaBar and Belle to determine $\gamma$.
Their most recently published results are


BaBar (Aubert et al. 2010): $\gamma = (68 \pm 14 \pm 4 \pm 3)^\circ$ (21.25)
Belle (Poluektov et al. 2010): $\gamma = (78.4^{+10.8}_{-11.6} \pm 3.6 \pm 8.9)^\circ$ (21.26)

where the last uncertainty is due to the $D$-decay modelling (which ignores,
for example, rescattering among the three final state particles). Both these
experiments use decays $B^\pm \to DK^\pm$, $B^\pm \to D^*K^\pm$ with $D^* \to D\pi^0$ and $D^* \to
D\gamma$; BaBar in addition uses the decays $D^0 \to K_SK^+K^-$.

We now turn to the other main method of detecting CP violation, through
the interference between decays of (for example) $B^0$ and $\bar B^0$ mesons that have
been produced in a coherent state by mixing. For this we need to set up the
formalism describing time-dependent mixing.




21.2 CP violation in B meson oscillations 


$B^0$-$\bar B^0$ oscillations have been studied by the BaBar and Belle collaborations at
the PEP2 and KEKB asymmetric $e^+e^-$ colliders. These machines operate at
a centre of mass energy equal to the mass of the $\Upsilon(4S)$ resonance state, which
is some 20 MeV above the threshold for $B^0\bar B^0$ production. If produced in a
symmetric $e^+e^-$ collider (with equal and opposite momenta for the $e^+$ and
$e^-$), the produced B mesons would move very slowly, $v/c \sim 0.06$, covering a
distance of only some 30 μm before decaying ($c\tau$ for B mesons is about 460
μm). This would make it impossible to resolve the decay vertices of the two
Bs, as is required in order to observe $B^0$-$\bar B^0$ oscillations, since the accuracy of
the decay vertex reconstruction is roughly 100 μm. Oddone (1989) suggested
making $e^+e^-$ collisions with asymmetric energy colliding beams, so that the
B mesons now move with the motion of the centre of mass, which can be
considerable. For example, at PEP2 ($e^-$ 9 GeV, $e^+$ 3.1 GeV) $\beta_{\rm cm} \sim 0.5$
and $\gamma_{\rm cm} \sim 1.15$, so that the distance travelled in the (asymmetric) lab frame
during the lifetime of an average B meson is $\sim 250$ μm, which is measurable.
At KEKB ($e^-$ 8 GeV, $e^+$ 3.5 GeV), $\beta_{\rm cm}\gamma_{\rm cm} \sim 0.425$.
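
The boost figures quoted above follow from elementary kinematics, as the
following sketch shows. It treats the beams as massless and head-on, and takes
$c\tau \approx 460$ μm from the text; the function name `centre_of_mass_boost` is
hypothetical, and the small motion of the B mesons in the $\Upsilon(4S)$ rest frame is
ignored.

```python
import math

C_TAU_B_UM = 460.0  # c*tau for B mesons in micrometres, as quoted in the text


def centre_of_mass_boost(e_minus_gev, e_plus_gev):
    """Return (beta, gamma) of the centre of mass for head-on massless beams."""
    total_energy = e_minus_gev + e_plus_gev
    total_momentum = abs(e_minus_gev - e_plus_gev)
    invariant_mass = math.sqrt(total_energy**2 - total_momentum**2)
    return total_momentum / total_energy, total_energy / invariant_mass


for name, e_minus, e_plus in (("PEP2", 9.0, 3.1), ("KEKB", 8.0, 3.5)):
    beta, gamma = centre_of_mass_boost(e_minus, e_plus)
    flight = beta * gamma * C_TAU_B_UM
    print(f"{name}: beta = {beta:.3f}, gamma = {gamma:.3f}, "
          f"beta*gamma = {beta * gamma:.3f}, mean flight = {flight:.0f} um")
```

The invariant mass in both cases comes out close to 10.58 GeV, the $\Upsilon(4S)$
mass, and the mean flight distance at PEP2 is about 250 μm, as stated.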

Since the $\Upsilon(4S)$ state has $J = 1$, the decay $\Upsilon \to B\bar B$ leaves the B mesons in
a p wave state, which is forbidden for two identical spinless bosons; therefore
one must be a $B^0$ and the other a $\bar B^0$, but we do not know which is which until
one has been identified (‘tagged’) in some way. The flavour of the tagged B
may be determined, for example, by the charge of the lepton emitted in the
semi-leptonic decays $B^0 \to D^-\ell^+\nu_\ell$, $\bar B^0 \to D^+\ell^-\bar\nu_\ell$. We shall not describe the
evolution of the $B\bar B$ coherent state following production; interested readers
may consult Cohen et al. (2009) for an instructive discussion, which also
covers neutrino oscillations. We shall be interested in the time dependence of
the state of the meson which partners the tagged meson, once the correlated
state has been collapsed by the tagging at time $t = 0$ say; the partner meson
will be reconstructed by its decay products. Note that the partner meson
can decay earlier or later than the tagged one; its state vector has that time
dependence which ensures that it becomes the correlate of the tagged particle
at $t = 0$.


21.2.1 Time-dependent mixing formalism 


We denote the neutral meson by $B$ (which will usually be $B^0$, but could also be
$K^0$ or $D^0$), and its CP-conjugate by $\bar{B}$. According to the theory of Weisskopf
and Wigner (1930a, 1930b) (see also appendix A of Kabir 1968) a state that
is initially in some superposition of $|B\rangle$ and $|\bar{B}\rangle$, say

$$ |\psi(0)\rangle = a(0)|B\rangle + b(0)|\bar{B}\rangle, \tag{21.27} $$

will evolve in time to a general superposition

$$ |\psi(t)\rangle = a(t)|B\rangle + b(t)|\bar{B}\rangle \tag{21.28} $$

governed by an effective Hamiltonian $H$ with matrix elements, in the 2-state
subspace,

$$ H = M - \frac{i}{2}\Gamma = \begin{pmatrix} A & p^2 \\ q^2 & A \end{pmatrix} \tag{21.29} $$

where $M$ and $\Gamma$ are Hermitian, and the equality $M_{11} - i\Gamma_{11}/2 = M_{22} - i\Gamma_{22}/2 = A$
follows from CPT invariance, which we shall assume. If CP is a good
symmetry, then

$$ \langle B|H|\bar{B}\rangle = \langle B|(CP)^{-1}(CP)H(CP)^{-1}CP|\bar{B}\rangle = \langle \bar{B}|H|B\rangle \tag{21.30} $$

so that $p$ would equal $q$. Since $M$ and $\Gamma$ are both Hermitian, this would imply
that $M_{12}$ and $\Gamma_{12}$ are both real; in the CP non-invariant world, this is not
the case.

The eigenvalues of $H$ are

$$ \omega_L = m_L - i\Gamma_L/2 = A + pq, \qquad \omega_H = m_H - i\Gamma_H/2 = A - pq, \tag{21.31} $$

and the corresponding eigenstates are

$$ |B_L\rangle = \frac{p|B\rangle + q|\bar{B}\rangle}{(|p|^2 + |q|^2)^{1/2}}, \qquad |B_H\rangle = \frac{p|B\rangle - q|\bar{B}\rangle}{(|p|^2 + |q|^2)^{1/2}}. \tag{21.32} $$


The states $|B_L\rangle$, $|B_H\rangle$ have definite masses $m_L$, $m_H$ and widths $\Gamma_L$ and $\Gamma_H$. The
widths $\Gamma_L$, $\Gamma_H$ are equal to a very good approximation for B and D mesons,
because the Q-values of both are large; in the case of K-mesons (see section
21.3), one state decays predominantly to $2\pi$ and the other to $3\pi$, with different
Q-values, and the lifetimes are very different.


Suppose now that at time t = 0 the 'tag' shows that a $B^0$ has decayed.
Then the partner is a $\bar{B}^0$ at t = 0, described by the superposition

$$ |\bar{B}^0\rangle = \frac{(|p|^2 + |q|^2)^{1/2}}{2q}\left(|B_L\rangle - |B_H\rangle\right). \tag{21.33} $$

At a later time t in the $\bar{B}^0$ rest-frame, this state evolves to (problem 21.1)

$$ |\bar{B}^0_t\rangle = g_+(t)|\bar{B}^0\rangle + (p/q)\,g_-(t)|B^0\rangle \tag{21.34} $$

where

$$ g_+(t) = e^{-im_B t}e^{-\Gamma t/2}\cos(\Delta m_B t/2) \tag{21.35} $$

$$ g_-(t) = i\,e^{-im_B t}e^{-\Gamma t/2}\sin(\Delta m_B t/2) \tag{21.36} $$

with $m_B = \tfrac{1}{2}(m_L + m_H)$ and $\Delta m_B = m_H - m_L$. Note, from (21.34), that the
state which started as a $\bar{B}^0$ at t = 0 develops also a $B^0$ component at a later
time. Similarly, if the tag shows that a $\bar{B}^0$ has decayed, the partner meson at
t = 0 is a $B^0$, and its state evolves to

$$ |B^0_t\rangle = (q/p)\,g_-(t)|\bar{B}^0\rangle + g_+(t)|B^0\rangle. \tag{21.37} $$
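
The closed forms for $g_\pm(t)$ can be checked against the direct sum and difference of the two eigenvalue phase factors. The sketch below assumes equal widths $\Gamma_L = \Gamma_H = \Gamma$ and natural units; the function names and the numerical inputs are illustrative only.

```python
import cmath
import math


def g_from_eigenvalues(t, m_l, m_h, gamma):
    """g_+ and g_- as half the sum and half the difference of exp(-i*omega*t).

    omega_L = m_l - i*gamma/2 and omega_H = m_h - i*gamma/2 (equal widths assumed).
    """
    e_l = cmath.exp(-1j * (m_l - 0.5j * gamma) * t)
    e_h = cmath.exp(-1j * (m_h - 0.5j * gamma) * t)
    return 0.5 * (e_l + e_h), 0.5 * (e_l - e_h)


def g_closed_form(t, m_l, m_h, gamma):
    """The cosine and sine forms, with m_B the mean mass and dm = m_h - m_l."""
    m_b = 0.5 * (m_l + m_h)
    dm = m_h - m_l
    common = cmath.exp(-1j * m_b * t) * math.exp(-0.5 * gamma * t)
    return common * math.cos(0.5 * dm * t), 1j * common * math.sin(0.5 * dm * t)


if __name__ == "__main__":
    # Arbitrary illustrative numbers in natural units.
    print(g_from_eigenvalues(1.3, 5.0, 5.5, 0.4))
    print(g_closed_form(1.3, 5.0, 5.5, 0.4))
```

The two functions agree only if $m_B$ is the mean of the two masses, which is why the factor of one half in its definition matters.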


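Consider first the semileptonic decays of $B^0$ and $\bar{B}^0$, where the only
transitions that can occur are

$$ B^0 \to \ell^+\nu_\ell X, \qquad \bar{B}^0 \to \ell^-\bar{\nu}_\ell X. \tag{21.38} $$

The state $|\bar{B}^0_t\rangle$ of (21.34), however, which was pure $\bar{B}^0$ at t = 0, will be able to
decay to a positively charged lepton via the admixture of the $|B^0\rangle$ component;
similarly negatively charged leptons may appear in the decay of $|B^0_t\rangle$. From
(21.34) and (21.37) we obtain directly the amplitudes for these 'wrong sign'
transitions:

$$ \langle \ell^-\bar{\nu}_\ell X|H_{\rm sl}|B^0_t\rangle = (q/p)\,g_-(t)\,\langle \ell^-\bar{\nu}_\ell X|H_{\rm sl}|\bar{B}^0\rangle \tag{21.39} $$

and

$$ \langle \ell^+\nu_\ell X|H_{\rm sl}|\bar{B}^0_t\rangle = (p/q)\,g_-(t)\,\langle \ell^+\nu_\ell X|H_{\rm sl}|B^0\rangle, \tag{21.40} $$

where $H_{\rm sl}$ is the relevant semileptonic part of the complete weak
current-current Hamiltonian. Hence the semileptonic asymmetry is

$$ A_{\rm SL} = \frac{\Gamma(\bar{B}^0_t \to \ell^+\nu_\ell X) - \Gamma(B^0_t \to \ell^-\bar{\nu}_\ell X)}{\Gamma(\bar{B}^0_t \to \ell^+\nu_\ell X) + \Gamma(B^0_t \to \ell^-\bar{\nu}_\ell X)} = \frac{1 - |q/p|^4}{1 + |q/p|^4} \tag{21.41} $$

independent of time. In (21.41) we have used the fact that $\langle \ell^-\bar{\nu}_\ell X|H_{\rm sl}|\bar{B}^0\rangle =
\langle \ell^+\nu_\ell X|H_{\rm sl}|B^0\rangle^*$. The upper bound on $A_{\rm SL}$ is of order $10^{-2}$ (Nakamura et al.
2010). At the present level of experimental precision, it is a very good
approximation to take $|q/p| = 1$. Since $q/p = [(M^*_{12} - i\Gamma^*_{12}/2)/(M_{12} - i\Gamma_{12}/2)]^{1/2}$, it
follows that in this approximation we can neglect $\Gamma_{12}$, and the phase of $q/p$ is
just minus the phase of $M_{12}$.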
FIGURE 21.5
Box diagram contributions to $B^0$-$\bar{B}^0$ mixing.


In the Standard Model, the $B^0$-$\bar{B}^0$ mixing amplitude occurs via the box
diagrams of figure 21.5. The box amplitude is approximately proportional
to the product of the masses of the internal quarks, and in this case the t
quark contribution dominates (the magnitudes of the CKM couplings are all
comparable). The phase of $M_{12}$ is then that of $(V^*_{td}V_{tb})^2$, which is the phase
of $((1 - \rho - i\eta)^*)^2$ in the parametrization of (20.179), which in turn is equal
to the angle $2\beta$. Hence

$$ (q/p) = e^{-2i\beta}, \tag{21.42} $$

neglecting terms of order $\lambda^4$. Equation (21.42) will be important in what
follows.

From (21.34) we can now read off that the probability that the state $|\bar{B}^0_t\rangle$
(which, we remind the reader, is the partner of the state tagged as a $B^0$ at
t = 0, and which is pure $\bar{B}^0$ at t = 0) decays as a $\bar{B}^0$ at $t \neq 0$, is $|g_+(t)|^2 =
\exp(-\Gamma t)\cos^2(\Delta m_B t/2)$. Similarly, the probability that this state decays as a
$B^0$ at time t is $\exp(-\Gamma t)\sin^2(\Delta m_B t/2)$, taking $|p/q| = 1$. Hence the difference
in these probabilities, normalized to their sum, is $\cos\Delta m_B t$. Measurements
of this flavour asymmetry yield the value of $\Delta m_B$, currently (Nakamura et al.
2010)

$$ \Delta m_B = (3.337 \pm 0.033) \times 10^{-10}\ {\rm MeV}. \tag{21.43} $$
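
To see what this mass difference means as an oscillation frequency, the short sketch below converts it to inverse picoseconds and evaluates the flavour asymmetry. It assumes $|q/p| = 1$ and uses $\hbar \approx 6.582 \times 10^{-10}$ MeV ps, a constant that is not quoted in the text; the function names are illustrative.

```python
import math

HBAR_MEV_PS = 6.582e-10  # hbar in MeV*ps (assumed value, not given in the text)


def delta_m_inverse_ps(delta_m_mev):
    """Convert a mass difference in MeV to an angular frequency in ps^-1."""
    return delta_m_mev / HBAR_MEV_PS


def flavour_asymmetry(t_ps, delta_m_mev):
    """(unmixed - mixed)/(unmixed + mixed) for |q/p| = 1.

    The common factor exp(-Gamma t) cancels in the ratio.
    """
    x = delta_m_inverse_ps(delta_m_mev) * t_ps
    unmixed = math.cos(0.5 * x) ** 2
    mixed = math.sin(0.5 * x) ** 2
    return (unmixed - mixed) / (unmixed + mixed)


if __name__ == "__main__":
    dm = 3.337e-10  # MeV
    print("dm in ps^-1:", delta_m_inverse_ps(dm))
    for t in (0.0, 1.5, 3.0, 6.0):
        print(t, flavour_asymmetry(t, dm))
```

With these inputs the frequency comes out close to 0.5 ps$^{-1}$, so the asymmetry completes roughly one full oscillation in about 12 ps.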

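More generally, we define decay amplitudes to final states $|f\rangle$ by

$$ A_f = \langle f|H_{\rm wk}|B^0\rangle, \qquad \bar{A}_f = \langle f|H_{\rm wk}|\bar{B}^0\rangle \tag{21.44} $$

$$ A_{\bar f} = \langle \bar f|H_{\rm wk}|B^0\rangle, \qquad \bar{A}_{\bar f} = \langle \bar f|H_{\rm wk}|\bar{B}^0\rangle, \tag{21.45} $$

where $CP|f\rangle = |\bar f\rangle$ and $H_{\rm wk}$ is the weak interaction Hamiltonian responsible
for the transition. We can now calculate the rates for $|\bar{B}^0_t\rangle$ to go to $|f\rangle$, and
for $|B^0_t\rangle$ to go to $|f\rangle$; up to a common normalization factor, which we omit,
these rates are (problem 21.2)

$$ \Gamma(\bar{B}^0_t \to f) = \tfrac{1}{2}e^{-\Gamma t}\Big\{ |\bar{A}_f|^2 + |(p/q)A_f|^2 + \big(|\bar{A}_f|^2 - |(p/q)A_f|^2\big)\cos\Delta m_B t + 2\,{\rm Im}\Big(\frac{q}{p}A^*_f\bar{A}_f\Big)\sin\Delta m_B t \Big\}, \tag{21.46} $$

and

$$ \Gamma(B^0_t \to f) = \tfrac{1}{2}e^{-\Gamma t}\Big\{ |A_f|^2 + |(q/p)\bar{A}_f|^2 + \big(|A_f|^2 - |(q/p)\bar{A}_f|^2\big)\cos\Delta m_B t - 2\,{\rm Im}\Big(\frac{q}{p}A^*_f\bar{A}_f\Big)\sin\Delta m_B t \Big\}. \tag{21.47} $$

The rates to $|\bar f\rangle$ are obtained by the substitutions $A_f \to A_{\bar f}$, $\bar{A}_f \to \bar{A}_{\bar f}$.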

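We can now derive the basic formulae for the time-dependent CP asymmetry
of neutral B decays to a final state f common to $B^0$ and $\bar{B}^0$ (problem
21.3):

$$ A_f = \frac{\Gamma(\bar{B}^0_t \to f) - \Gamma(B^0_t \to f)}{\Gamma(\bar{B}^0_t \to f) + \Gamma(B^0_t \to f)} = S_f\sin(\Delta m_B t) - C_f\cos(\Delta m_B t) \tag{21.48} $$

where

$$ S_f = \frac{2\,{\rm Im}\lambda_f}{1 + |\lambda_f|^2}, \qquad C_f = \frac{1 - |\lambda_f|^2}{1 + |\lambda_f|^2}, \qquad \lambda_f = \frac{q}{p}\frac{\bar{A}_f}{A_f}. \tag{21.49} $$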
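
As a numerical cross-check of (21.48) and (21.49), the sketch below builds the asymmetry directly from the two rates and compares it with $S_f\sin\Delta m_B t - C_f\cos\Delta m_B t$. It assumes $|q/p| = 1$ and normalizes $A_f = 1$, so that $(q/p)\bar{A}_f = \lambda_f$; the function names and the test value of $\lambda_f$ are illustrative.

```python
import cmath
import math


def cp_coefficients(lam):
    """Return (S_f, C_f) for a complex lambda_f."""
    norm = 1.0 + abs(lam) ** 2
    return 2.0 * lam.imag / norm, (1.0 - abs(lam) ** 2) / norm


def asymmetry_from_rates(lam, x):
    """Asymmetry at x = dm*t, from the two rates with A_f = 1 and |q/p| = 1.

    Common factors (exp(-Gamma t)/2) cancel in the ratio and are dropped.
    """
    mod2 = abs(lam) ** 2
    rate_bbar = 1 + mod2 - (1 - mod2) * math.cos(x) + 2 * lam.imag * math.sin(x)
    rate_b = 1 + mod2 + (1 - mod2) * math.cos(x) - 2 * lam.imag * math.sin(x)
    return (rate_bbar - rate_b) / (rate_bbar + rate_b)


if __name__ == "__main__":
    lam = 0.8 * cmath.exp(-0.9j)  # arbitrary illustrative value
    s_f, c_f = cp_coefficients(lam)
    for x in (0.3, 1.0, 2.5):
        print(asymmetry_from_rates(lam, x), s_f * math.sin(x) - c_f * math.cos(x))
```

When $|\lambda_f| = 1$ the cosine term vanishes and the asymmetry is a pure sine wave of amplitude ${\rm Im}\,\lambda_f$.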


21.2.2 Determination of the angles $\alpha(\phi_2)$ and $\beta(\phi_1)$ of the
unitarity triangle


A very large number of measurements have been made, constraining the
parameters of the CKM matrix, or equivalently the unitarity triangle of figure
20.10. We shall limit our discussion to those measurements which determine
the angles $\alpha(\phi_2)$ and $\beta(\phi_1)$ of the triangle.


(i) The angle $\beta(\phi_1)$

One of the cleanest examples is the decay

$$ B^0 \to J/\psi + K^0_{S,L}. \tag{21.50} $$


The tree diagram is shown in figure 21.6(a), and the penguins in figure 21.6(b).
The tree diagram contributes CKM factors $V^*_{cb}V_{cs} = A\lambda^2(1 - \lambda^2/2)$. The $f = u$
penguin has factors $V^*_{ub}V_{us} = A\lambda^4(\rho + i\eta)$, which is suppressed by two powers of
$\lambda$; it also carries a loop factor $\sim \alpha_s/\pi$, and it may therefore be safely neglected.
The other two penguins have the same weak phase as the tree diagram. Hence
to a good approximation we can write the amplitude as

$$ A_{\psi K} = (V^*_{cb}V_{cs})\,T_{\psi K}. \tag{21.51} $$

There is one subtlety: to get the two final states from $B^0$ and $\bar{B}^0$ to interfere,
we need $K^0$-$\bar{K}^0$ mixing to produce the (very nearly) CP eigenstates $K^0_S$ (CP
= +1) and $K^0_L$ (CP = −1). (We shall discuss the $K^0$-$\bar{K}^0$ system briefly
in section 21.3.) This introduces a factor $(q/p)_K = (V^*_{cd}V_{cs}/V_{cd}V^*_{cs})$, quite
analogously to (21.42), but its effect on $\lambda_{\psi K}$ is negligible. So, remembering
that the relative orbital angular momentum of the two final state particles is
$\ell = 1$, we have $\lambda_{\psi K_S} = -\exp(-2i\beta)$ and $S_{\psi K_S} = \sin 2\beta$, while the $J/\psi K^0_L$ state
has CP = +1 and $S_{\psi K_L} = -\sin 2\beta$. Hence $S_{\psi K}$ measures $-\eta_f\sin 2\beta$, where $\eta_f$
is the CP eigenvalue of the $J/\psi K^0_{S,L}$ state: the sinusoidal oscillations in the
asymmetry $A_{\psi K}$ for the two modes S, L will have the same amplitude and
opposite phase.

FIGURE 21.6
Tree (a) and penguin (b) contributions to $B^0 \to J/\psi + K^0_S$, via the quark
transitions $\bar{b} \to \bar{c}c\bar{s}$.

Both BaBar and Belle have reported increasingly precise measurements
of $A_{\psi K}$ in these modes. The early results (Abashian et al. 2001, Aubert
et al. 2001) were the first direct measurements of one of the angles of the
unitarity triangle, offering a test of the consistency of the CKM mechanism
for CP violation. Later measurements have achieved accuracies of about ±5%.
The current world average for $\sin 2\beta$ is (see the review by Ceccucci et al. in
Nakamura et al. 2010)

$$ \sin 2\beta = 0.673 \pm 0.023. \tag{21.52} $$

Figure 21.7 shows the asymmetry (before corrections for experimental
effects) for $\eta_f = -1$ and $\eta_f = +1$ candidates as measured by BaBar (Aubert
et al. 2007a); Belle has reported similar results. We should note that a
measurement of $\sin 2\beta$ still leaves ambiguities in $\beta$ (for example, $\beta \to \pi/2 - \beta$),
which can be resolved by other measurements (Ceccucci et al., in Nakamura
et al. 2010).
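
The ambiguity is easy to make concrete. The sketch below, with an illustrative function name, returns the two values of $\beta$ between 0° and 90° that share a given $\sin 2\beta$; further solutions outside this range are not listed.

```python
import math


def beta_candidates_deg(sin_2beta):
    """Two solutions for beta in [0, 90] degrees with the same sin(2*beta)."""
    two_beta = math.asin(sin_2beta)        # principal value, in radians
    first = 0.5 * math.degrees(two_beta)   # the smaller solution
    return first, 90.0 - first             # beta and pi/2 - beta


if __name__ == "__main__":
    print(beta_candidates_deg(0.673))
```

For the world average quoted above, the two candidates are roughly 21° and 69°.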


(ii) The angle $\alpha(\phi_2)$


The angle $\alpha$ is the phase between $V^*_{tb}V_{td}$ and $V^*_{ub}V_{ud}$. It can be measured in
decays dominated by the quark transition $\bar{b} \to \bar{u}u\bar{d}$. Consider, for example,
the decays $B^0 \to \pi^+\pi^-$, $\bar{B}^0 \to \pi^+\pi^-$. Figure 21.8 shows the tree graph (a)
and penguin (b) contributions to $B^0 \to \pi^+\pi^-$. Exposing the weak phases as
before, the amplitude is

$$ A_{\pi\pi} = V^*_{ub}V_{ud}\,(t + p_u - p_c) + V^*_{tb}V_{td}\,(p_t - p_c) = V^*_{ub}V_{ud}\,T_{\pi\pi} + V^*_{tb}V_{td}\,P_{\pi\pi}. \tag{21.53} $$


FIGURE 21.7

(a) Number of $\eta_f = -1$ candidates in the signal region with a $B^0$ tag ($N_{B^0}$) and
with a $\bar{B}^0$ tag ($N_{\bar{B}^0}$), and (b) the measured asymmetry $(N_{B^0} - N_{\bar{B}^0})/(N_{B^0} +
N_{\bar{B}^0})$, as functions of t; (c) and (d) are the corresponding distributions for
the $\eta_f = +1$ candidates. Figure reprinted with permission from Aubert et al.
(BaBar Collaboration) Phys. Rev. Lett. 99 171803 (2007). Copyright 2007
by the American Physical Society. (See color plate IV.)


FIGURE 21.8
Tree graph (a) and penguin (b) contributions to $B^0 \to \pi^+\pi^-$, via quark
transitions $\bar{b} \to \bar{d}u\bar{u}$.




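Suppose first that the penguin contributions could be neglected. Then the
asymmetry $A_{\pi^+\pi^-}$ would measure

$$ {\rm Im}\,\lambda_{\pi^+\pi^-} = {\rm Im}\left(\frac{q}{p}\frac{\bar{A}_{\pi\pi}}{A_{\pi\pi}}\right) = {\rm Im}\left(e^{-2i\beta}\,\frac{V_{ub}V^*_{ud}}{V^*_{ub}V_{ud}}\right) = {\rm Im}\,e^{-2i(\beta+\gamma)} = \sin 2\alpha \tag{21.54} $$

where $\alpha$ is defined as $\pi - \beta - \gamma$. Unfortunately, this simple result is spoiled by
the penguin contributions, which there is no good reason to ignore. However,
Gronau and London (1990) showed how an isospin analysis could disentangle
the tree and penguin parts. The method involves the three amplitudes
$A_{+-}(B^0 \to \pi^+\pi^-)$, $A_{00}(B^0 \to \pi^0\pi^0)$, and $A_{+0}(B^+ \to \pi^+\pi^0)$.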

First of all, note that Bose statistics for the $2\pi$ states requires them to have
only the symmetric isospin states $I = 0$ or 2, since the angular momentum is
zero. Next, the effective non-leptonic weak Hamiltonian $H_{\rm nl}$ acting in the tree
diagram transition contains both $\Delta I = 1/2$ and $\Delta I = 3/2$ pieces; combining
with the initial $I = 1/2$ of the B meson, the first piece will lead only to the
$I = 0$ final state, while the second contributes to both $I = 0$ and $I = 2$ final
states. However, since the gluon in the penguin diagrams carries no isospin,
these diagrams can only change the isospin by $\Delta I = 1/2$, which connects only
to the $I = 0$ final state. The conclusion is that the $I = 2$ final state is free of
penguins, and carries the pure tree phase.

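This information can be exploited as follows (Gronau and London 1990).
First, the action of $H_{\rm nl}$ on the $B^0$ state can be written as

$$ H_{\rm nl}\,|\tfrac{1}{2}\ {-\tfrac{1}{2}}\rangle = \frac{1}{\sqrt{2}}A_{3/2}\,|2\ 0\rangle + \frac{1}{\sqrt{2}}A_{1/2}\,|0\ 0\rangle \tag{21.55} $$

where as usual $|I\ I_3\rangle$ is the state with isospin $I$ and third component $I_3$.
Expanding the states $\pi^+\pi^-$, $\pi^+\pi^0$ and $\pi^0\pi^0$ in terms of definite isospin states,
one finds (problem 21.4)

$$ \frac{1}{\sqrt{2}}A_{+-} = \frac{1}{\sqrt{12}}A_{3/2} - \frac{1}{\sqrt{6}}A_{1/2}, \qquad A_{+0} = \frac{\sqrt{3}}{2}A_{3/2}, \qquad A_{00} = \frac{1}{\sqrt{3}}A_{3/2} + \frac{1}{\sqrt{6}}A_{1/2}, \tag{21.56} $$

where $A_{ij}$ is the amplitude $\langle \pi^i\pi^j|H_{\rm nl}|B^{i+j}\rangle$. The $\pi^+\pi^0$ state can have only
$I = 2$, and arises solely from the tree diagram. Furthermore, the three complex
amplitudes $A_{+-}$, $A_{+0}$ and $A_{00}$ are expressed in terms of only two reduced
amplitudes $A_{3/2}$ and $A_{1/2}$, leading to one relation between them:

$$ \frac{1}{\sqrt{2}}A_{+-} + A_{00} = A_{+0}, \tag{21.57} $$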
which can be represented as a triangle in the complex plane, as shown in figure
21.9. There is a similar triangle for the charge conjugate processes

$$ \frac{1}{\sqrt{2}}\bar{A}_{+-} + \bar{A}_{00} = \bar{A}_{+0}, \tag{21.58} $$

where the $\bar{A}$ amplitudes are obtained from the $A$s by complex conjugating the
CKM couplings, the strong phases remaining the same as usual.


FIGURE 21.9
The triangle formed by the three amplitudes $A_{ij}$ in equation (21.57).


Since $A_{+0}$ is pure tree, its weak phase is well defined, namely that of
$V^*_{ub}V_{ud}$, which is $\gamma$. It is convenient to define (Lipkin et al. 1991) $\tilde{A} =
\exp(2i\gamma)\bar{A}$, so that $\tilde{A}_{+0} = A_{+0}$. Then the two triangles have a common
base, $A_{+0}$. The failure of the two triangles to overlap exactly is a measure
of the penguin contribution. In principle, by measuring the asymmetry
coefficients $S_{\pi^+\pi^-}$, $C_{\pi^+\pi^-}$, the branching fractions of all three modes, and $C_{00}$,
one can construct the triangles. But unfortunately the relative orientation of
the triangles is not known, which leads to various possible solutions to $\alpha$ in
the range $0 < \alpha < 2\pi$. In addition, the data on $\pi^0\pi^0$ (with a branching ratio
of order $10^{-6}$) has sizeable experimental errors, and only a relatively loose
constraint on $\alpha$ can be obtained.

A much better constraint can be found from the CP asymmetries in
$B \to \rho\pi$ decays (Snyder and Quinn 1993). The method is essentially a
time-dependent version of the Dalitz plot analysis discussed in section 21.1. The
available channels are

$$ B^0 \to \{\rho^+\pi^-,\ \rho^-\pi^+,\ \rho^0\pi^0\} \to \pi^+\pi^-\pi^0 $$
$$ \bar{B}^0 \to \{\rho^-\pi^+,\ \rho^+\pi^-,\ \rho^0\pi^0\} \to \pi^+\pi^-\pi^0 \tag{21.59} $$

where all result in the final state $\pi^+\pi^-\pi^0$ after the decay of the $\rho$ mesons,
and interferences following $B^0$-$\bar{B}^0$ mixing are possible.

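Returning then to equations (21.46) and (21.47), the rate for the $3\pi$ decay
following a $B^0$ tag at t = 0 is

$$ \Gamma(\bar{B}^0_t \to 3\pi) = \tfrac{1}{2}e^{-\Gamma t}\Big\{ |\bar{A}_{3\pi}|^2 + |A_{3\pi}|^2 + \big(|\bar{A}_{3\pi}|^2 - |A_{3\pi}|^2\big)\cos\Delta m_B t + 2\,{\rm Im}\Big(\frac{q}{p}A^*_{3\pi}\bar{A}_{3\pi}\Big)\sin\Delta m_B t \Big\}, \tag{21.60} $$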




and there is a similar formula, with appropriate changes, for the case of a $\bar{B}^0$
tag at t = 0. We now write

$$ A_{3\pi} = f_+(s_+)F^+ + f_-(s_-)F^- + f_0(s_0)F^0 \tag{21.61} $$

and similarly

$$ \bar{A}_{3\pi} = f_+(s_+)\bar{F}^+ + f_-(s_-)\bar{F}^- + f_0(s_0)\bar{F}^0, \tag{21.62} $$

where $s_+ = (p_{\pi^+} + p_{\pi^0})^2$, $s_- = (p_{\pi^-} + p_{\pi^0})^2$, $s_0 = (p_{\pi^+} + p_{\pi^-})^2$, satisfying
$s_+ + s_- + s_0 = m_B^2 + 2m^2_{\pi^+} + m^2_{\pi^0}$. $f_\kappa(s_\kappa)$ is the sum of three relativistic
Breit-Wigner resonance amplitudes, together with appropriate angular
momentum and angle factors, corresponding to the $\rho(770)$, $\rho(1450)$ and $\rho(1700)$
resonances. $F^\kappa$ is the amplitude for the quasi two-body mode $B^0 \to \rho^\kappa\pi^{\bar\kappa}$.
Here $\kappa$ takes the values +, − and 0, and correspondingly $\bar\kappa = -, +, 0$. The
amplitudes $F^\kappa$ are complex and include the strong and weak transition phases,
from tree and penguin diagrams; they are, however, independent of the Dalitz
plot variables.

The $\rho\pi$ states have the same decomposition into tree and penguin parts
as discussed previously for the $\pi\pi$ states, namely

$$ F^\kappa = e^{i\gamma}T^\kappa + e^{-i\beta}P^\kappa, \tag{21.63} $$

where the magnitudes of the weak couplings have been absorbed into $T^\kappa$ and
$P^\kappa$. We can rewrite (21.63) as

$$ e^{i\beta}F^\kappa = -e^{-i\alpha}T^\kappa + P^\kappa \equiv A^\kappa, \tag{21.64} $$

and similarly

$$ e^{i\beta}(q/p)\bar{F}^\kappa = -e^{i\alpha}T^{\bar\kappa} + P^{\bar\kappa} \equiv \bar{A}^\kappa. \tag{21.65} $$

Then (21.61) and (21.62) become

$$ A_{3\pi} = \sum_\kappa f_\kappa(s_\kappa)A^\kappa \tag{21.66} $$

$$ (q/p)\bar{A}_{3\pi} = \sum_\kappa f_\kappa(s_\kappa)\bar{A}^\kappa, \tag{21.67} $$

disregarding a common overall phase $e^{-i\beta}$. When (21.66) and (21.67) are
inserted into (21.60), it is clear that one obtains many terms, for example

$$ {\rm Re}(f_+f_-^*)\,{\rm Im}(\bar{A}^+A^{-*} + \bar{A}^-A^{+*}), \qquad {\rm Im}(f_+f_-^*)\,{\rm Re}(\bar{A}^+A^{-*} - \bar{A}^-A^{+*}), \tag{21.68} $$

and so on, in which different resonances interfere on the Dalitz plot. The
strong, and known, rapid phase variation in these interference regions, via
factors such as $f_+f_-^*$, is a powerful tool for extracting the complex amplitudes
$A^\kappa$, $\bar{A}^\kappa$, and hence via (21.64) and (21.65) the phase $\alpha$. The quantities
multiplying the interference terms ${\rm Re}(f_\kappa f_\sigma^*)$ and ${\rm Im}(f_\kappa f_\sigma^*)$ are the key degrees of
freedom which allow this analysis to determine the penguin contributions and
the strong phases, and hence $\alpha$. However, the resonance overlap regions cover
a small fraction of the Dalitz plot, so that a substantial data sample (a few
thousand events) is needed to constrain all the amplitude parameters.

An isospin analysis similar to that of the $\pi\pi$ states can be done for the
$\rho\pi$ states, but now there is no reason to forbid the final state to have $I = 1$.
Nevertheless, if charged B decays are also included, there are five physical
amplitudes ($B^0 \to \rho^+\pi^-$, $\rho^-\pi^+$, $\rho^0\pi^0$, $B^+ \to \rho^+\pi^0$, $\rho^0\pi^+$) which are
expressible in terms of two pure tree ($\Delta I = 3/2$) transitions to $I = 1, 2$ final
states. One of the pure tree amplitudes may be written (Gronau 1991) as the
sum $A^+ + A^- + 2A^0$, and hence the ratio $(\bar{A}^+ + \bar{A}^- + 2\bar{A}^0)/(A^+ + A^- + 2A^0)$
has the phase $2\alpha$.

This approach has been followed by both BaBar and Belle, with the results

$$ \text{BaBar (Aubert et al. 2007b)} \qquad \alpha = (87^{+45}_{-13})^\circ; \tag{21.69} $$
$$ \text{Belle (Kusaka et al. 2007)} \qquad 68^\circ < \alpha < 95^\circ. \tag{21.70} $$

These results are consistent with the values of $\beta$ and $\gamma$ given in (21.52), (21.25)
and (21.26), given the definition $\alpha = \pi - \beta - \gamma$.

Of course, this is only one (at present not very tight) consistency check.
But there are now very many independent measurements of the magnitudes of
the CKM matrix elements, as well as the angles. We shall not describe these
here, referring the reader to the regular updates by the Particle Data Group
(currently Nakamura et al. 2010). We showed in figure 20.11 the 2010 plot of
the constraints in the $\bar\rho, \bar\eta$ plane, presented by Ceccucci et al. They concluded
that the 95% CL regions all overlapped consistently around the global fit
region, though the consistency of $|V_{ub}|$ and $\sin 2\beta$ was not very good. It would
be premature to make too much of the minor reservation, though it may be
noted that $\sin 2\beta$ could be sensitive to new physics via short-distance
corrections to the box diagrams of figure 21.5, while $|V_{ub}|$ is obtained from a
tree-level process, and is thus unlikely to be affected by new physics. Overall, the
consistency represented in figure 20.11 must be counted as a major triumph
of the Standard Model, in particular of the original analysis by Kobayashi
and Maskawa (1973). It must be remembered, though, that many extensions
of the Standard Model allow considerable room for new CP-violating effects,
which could be revealed by increasingly precise determinations of the CKM
parameters.




21.3 CP violation in neutral K-meson decays 


Although the formalism is similar, the phenomenology of CP violation in neu- 
tral K-meson decays is very different from that in neutral B-meson decays. 
In the K case, CP violation is a very small effect, typically at the level of 
parts per thousand or smaller; its observation by Christenson et al. (1964) was
a historic achievement. But the neutral K system is most simply (and
traditionally) approached by starting with the assumption that CP is conserved.
We will define $CP|K^0\rangle = -|\bar{K}^0\rangle$; then the CP eigenstates are

$$ |K_\pm\rangle = \frac{1}{\sqrt{2}}\left(|K^0\rangle \mp |\bar{K}^0\rangle\right). \tag{21.71} $$

The CP = +1 state can decay to two pions in an s-state, but not to three pions
if (as we are assuming to start with) CP is a good symmetry; the situation
is the opposite for the CP = −1 state. The Q-value for the three pion mode
is very much smaller than for the two pion mode, with the result that the
$|K_+\rangle$ state, decaying to two pions, has a much shorter lifetime than the $|K_-\rangle$
state: $\tau_{2\pi} \approx 0.9 \times 10^{-10}$ s, $\tau_{3\pi} \approx 5 \times 10^{-8}$ s. Due to CP violation, the actual
eigenstates $|K_L\rangle$ and $|K_S\rangle$ of the effective Hamiltonian are slightly different
from $|K_\pm\rangle$ (see (21.75) and (21.76)), with masses $m_S$ and $m_L$, and widths $\Gamma_S$
and $\Gamma_L$. At this point, however, we shall associate $m_S$ and $\Gamma_S$ with $|K_+\rangle$, and
$m_L$ and $\Gamma_L$ with $|K_-\rangle$.

A $K^0$ is produced in strangeness-conserving reactions such as $K^+ n \to K^0 p$,
and a $\bar{K}^0$ in $K^- + p \to \bar{K}^0 + n$, for example. However, the two states can mix
following production, since (as usual) it is the Hamiltonian eigenstates which
propagate in free space, and they are the superpositions $|K_\pm\rangle$, assuming CP
is conserved. Hence, as time proceeds following production, a state produced
as a $K^0$ at time t = 0 will evolve into the state

$$ |K^0_t\rangle = \tfrac{1}{2}\left(e^{-\Gamma_L t/2 - im_L t} + e^{-\Gamma_S t/2 - im_S t}\right)|K^0\rangle + \tfrac{1}{2}\left(e^{-\Gamma_L t/2 - im_L t} - e^{-\Gamma_S t/2 - im_S t}\right)|\bar{K}^0\rangle. \tag{21.72} $$
The probability that a $K^0$ ($\bar{K}^0$) will then be observed at time t following
production (in the K-meson rest frame) is

$$ P_\pm(t) = \tfrac{1}{4}\left[e^{-\Gamma_L t} + e^{-\Gamma_S t} \pm 2e^{-(\Gamma_L + \Gamma_S)t/2}\cos\Delta m\,t\right] \tag{21.73} $$

where $\Delta m = m_L - m_S$. This is the famous phenomenon of strangeness
oscillations, predicted by Gell-Mann and Pais (1955). Experimentally, the
strangeness of the state at time t is defined by the modes $K^0 \to \pi^-\ell^+\nu_\ell$
and $\bar{K}^0 \to \pi^+\ell^-\bar{\nu}_\ell$. The difference $P_+(t) - P_-(t)$ is measured, and although
the oscillations are heavily damped by $\exp(-\Gamma_S t)$, the mass difference can be
determined:

$$ \Delta m = (3.483 \pm 0.006) \times 10^{-12}\ {\rm MeV}. \tag{21.74} $$
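
The strangeness oscillation formula can be evaluated directly. The sketch below uses the lifetimes quoted earlier in this section and a mass difference of $3.483 \times 10^{-12}$ MeV, together with $\hbar \approx 6.582 \times 10^{-22}$ MeV s, a constant that is not quoted in the text. It treats CP as conserved, as in (21.73); the function name and defaults are illustrative.

```python
import math

HBAR_MEV_S = 6.582e-22  # hbar in MeV*s (assumed value, not given in the text)


def strangeness_probabilities(t_s, tau_s=0.9e-10, tau_l=5.0e-8, delta_m_mev=3.483e-12):
    """Return (P_plus, P_minus): probabilities to find K0 or K0bar at proper time t_s.

    The state is assumed to be a pure K0 at t = 0, with CP conserved.
    """
    gamma_s, gamma_l = 1.0 / tau_s, 1.0 / tau_l
    dm = delta_m_mev / HBAR_MEV_S  # angular frequency in s^-1
    envelope = math.exp(-gamma_l * t_s) + math.exp(-gamma_s * t_s)
    interference = 2.0 * math.exp(-0.5 * (gamma_l + gamma_s) * t_s) * math.cos(dm * t_s)
    return 0.25 * (envelope + interference), 0.25 * (envelope - interference)


if __name__ == "__main__":
    for n in (0, 1, 2, 4, 8):
        t = n * 0.9e-10  # multiples of the short lifetime
        print(n, strangeness_probabilities(t))
```

Because $\Delta m$ is only about half of $\Gamma_S$ in these units, less than one full oscillation is visible before the interference term has died away.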


However, this is not the whole story. Christenson et al. (1964) found
that, after many $\tau_S$ lifetimes, some $2\pi$ events were observed, indicating that
the surviving state $K_L$ was capable of decaying to $2\pi$ after all (albeit very
rarely). The same conclusion follows from the fact that $P_+(t) - P_-(t)$ does
not go to zero at long times, as it should from (21.73). Accordingly, the true
Hamiltonian eigenstates are not quite the CP eigenstates, but rather

$$ |K_L\rangle = \left[(1 + \bar\epsilon)|K^0\rangle + (1 - \bar\epsilon)|\bar{K}^0\rangle\right]/\sqrt{2(1 + |\bar\epsilon|^2)} \tag{21.75} $$
$$ |K_S\rangle = \left[(1 + \bar\epsilon)|K^0\rangle - (1 - \bar\epsilon)|\bar{K}^0\rangle\right]/\sqrt{2(1 + |\bar\epsilon|^2)}. \tag{21.76} $$


FIGURE 21.10
Box diagrams contributing to $K^0$-$\bar{K}^0$ mixing.

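This is a traditional parametrization in K-physics, similar to that in (21.32)
with $q/p = (1 - \bar\epsilon)/(1 + \bar\epsilon)$ (this is why we chose $CP|K^0\rangle = -|\bar{K}^0\rangle$). We now
find that a state which starts at t = 0 as a $K^0$ evolves to

$$ |K^0_t\rangle = g_+(t)|K^0\rangle + \frac{1 - \bar\epsilon}{1 + \bar\epsilon}\,g_-(t)|\bar{K}^0\rangle \tag{21.77} $$

where

$$ g_\pm(t) = e^{-\Gamma_L t/2}e^{-im_L t}\left[1 \pm e^{-\Delta\Gamma t/2}e^{i\Delta m\,t}\right] \tag{21.78} $$

with $\Delta\Gamma = \Gamma_S - \Gamma_L$, $\Delta m = m_L - m_S$, and we have omitted a normalization
factor. Similarly, a state tagged as $\bar{K}^0$ at t = 0 evolves to

$$ |\bar{K}^0_t\rangle = \frac{1 + \bar\epsilon}{1 - \bar\epsilon}\,g_-(t)|K^0\rangle + g_+(t)|\bar{K}^0\rangle. \tag{21.79} $$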


The $K^0$-$\bar{K}^0$ mixing amplitude arises in the Standard Model from the box
graphs shown in figure 21.10 (cf figure 21.5). These contain factors of $m_q^2$,
but the magnitude of the four CKM couplings to the t quark are of order $\lambda^{10}$,
compared with $\lambda^2$ for the c quark, so that the c quark diagram dominates, with
a CKM factor of $(V_{cs}V^*_{cd})^2$, which is real to a good approximation. This means
that ${\rm Im}\,\bar\epsilon$ is very small. A comparison of the mass difference $\Delta m$ predicted
from figure 21.10 and the experimental value is complicated by uncertainties
in the hadronic matrix element.

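The traditional reactions in which CP violation is probed in K decays are
the $2\pi$ modes, where one looks for the existence of $K_L \to 2\pi$. There is also
the semileptonic asymmetry. Three common observables are defined by

$$ \eta_{00} = \frac{\langle \pi^0\pi^0|H_{\rm wk}|K_L\rangle}{\langle \pi^0\pi^0|H_{\rm wk}|K_S\rangle}, \qquad \eta_{+-} = \frac{\langle \pi^+\pi^-|H_{\rm wk}|K_L\rangle}{\langle \pi^+\pi^-|H_{\rm wk}|K_S\rangle} \tag{21.80} $$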



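and

$$ \delta_L = \frac{\Gamma(K_L \to \pi^-\ell^+\nu_\ell) - \Gamma(K_L \to \pi^+\ell^-\bar{\nu}_\ell)}{\Gamma(K_L \to \pi^-\ell^+\nu_\ell) + \Gamma(K_L \to \pi^+\ell^-\bar{\nu}_\ell)}. \tag{21.81} $$

The experimental numbers are (Nakamura et al. 2010)

$$ |\eta_{00}| = (2.221 \pm 0.011) \times 10^{-3}, \qquad |\eta_{+-}| = (2.232 \pm 0.011) \times 10^{-3} \tag{21.82} $$

$$ {\rm Arg}\,\eta_{00} \approx 43.5^\circ, \qquad {\rm Arg}\,\eta_{+-} \approx 43.5^\circ \tag{21.83} $$

and

$$ \delta_L = (3.32 \pm 0.06) \times 10^{-3}. \tag{21.84} $$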


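It is useful to describe the final $2\pi$ states in terms of their isospin, which
then have a definite strong interaction phase. As noted in connection with the
B decays, the allowed isospin states are only $I = 0$ and $I = 2$, and one has

$$ A_{+-} \equiv A_{K^0 \to \pi^+\pi^-} = \sqrt{\tfrac{2}{3}}\,|A_0|e^{i(\delta_0 + \phi_0)} + \sqrt{\tfrac{1}{3}}\,|A_2|e^{i(\delta_2 + \phi_2)} \tag{21.85} $$
$$ \bar{A}_{+-} \equiv A_{\bar{K}^0 \to \pi^+\pi^-} = -\sqrt{\tfrac{2}{3}}\,|A_0|e^{i(\delta_0 - \phi_0)} - \sqrt{\tfrac{1}{3}}\,|A_2|e^{i(\delta_2 - \phi_2)} \tag{21.86} $$

where the minus sign arises from our choice $CP|K^0\rangle = -|\bar{K}^0\rangle$, and where $\delta_I$
and $\phi_I$ are the strong and weak phases, respectively, for the state with isospin
$I$. Also,

$$ A_{00} \equiv A_{K^0 \to \pi^0\pi^0} = \sqrt{\tfrac{1}{3}}\,|A_0|e^{i(\delta_0 + \phi_0)} - \sqrt{\tfrac{2}{3}}\,|A_2|e^{i(\delta_2 + \phi_2)} \tag{21.87} $$
$$ \bar{A}_{00} \equiv A_{\bar{K}^0 \to \pi^0\pi^0} = -\sqrt{\tfrac{1}{3}}\,|A_0|e^{i(\delta_0 - \phi_0)} + \sqrt{\tfrac{2}{3}}\,|A_2|e^{i(\delta_2 - \phi_2)}. \tag{21.88} $$

The significant fact experimentally is that $|A_2|/|A_0| \approx 1/22$, a manifestation of
the '$\Delta I = 1/2$' rule in this case (i.e. $\Delta I = 3/2$ is suppressed; see, for example,
Donoghue et al. 1992, section VIII-4). Inserting (21.85) and (21.86), (21.87)
and (21.88), into (21.80) and retaining only first-order terms in $|A_2|/|A_0|$, and
treating $\phi_0$ and $\phi_2$ as small, we find (problem 21.5)

$$ \eta_{00} = \bar\epsilon + i\phi_0 - i\sqrt{2}\,\frac{|A_2|}{|A_0|}\,(\phi_2 - \phi_0)\,e^{i(\delta_2 - \delta_0)} \tag{21.89} $$
$$ \eta_{+-} = \bar\epsilon + i\phi_0 + \frac{i}{\sqrt{2}}\,\frac{|A_2|}{|A_0|}\,(\phi_2 - \phi_0)\,e^{i(\delta_2 - \delta_0)}. \tag{21.90} $$

These relations are usually written as

$$ \eta_{00} = \epsilon - 2\epsilon', \qquad \eta_{+-} = \epsilon + \epsilon', \tag{21.91} $$

where

$$ \epsilon = \bar\epsilon + i\phi_0, \qquad \epsilon' = \frac{i}{\sqrt{2}}\,e^{i(\delta_2 - \delta_0)}\,\frac{|A_2|}{|A_0|}\,(\phi_2 - \phi_0). \tag{21.92} $$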
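
The first-order relations $\eta_{00} = \epsilon - 2\epsilon'$ and $\eta_{+-} = \epsilon + \epsilon'$ can be tested numerically against the exact amplitude ratios. The sketch below assumes the isospin decomposition with Clebsch-Gordan factors $\sqrt{2/3}$, $\sqrt{1/3}$ and an overall minus sign for the $\bar{K}^0$ amplitudes (from $CP|K^0\rangle = -|\bar{K}^0\rangle$), and eigenstates proportional to $(1+\bar\epsilon)|K^0\rangle \pm (1-\bar\epsilon)|\bar{K}^0\rangle$. All numerical inputs in the usage part are hypothetical, chosen only to be small.

```python
import cmath
import math


def eta_exact(eps_bar, a0, a2, d0, d2, phi0, phi2, channel):
    """Exact <pi pi|H|K_L> / <pi pi|H|K_S> for channel '+-' or '00'."""
    if channel == "+-":
        c0, c2 = math.sqrt(2.0 / 3.0), math.sqrt(1.0 / 3.0)
    else:
        c0, c2 = math.sqrt(1.0 / 3.0), -math.sqrt(2.0 / 3.0)
    amp = c0 * a0 * cmath.exp(1j * (d0 + phi0)) + c2 * a2 * cmath.exp(1j * (d2 + phi2))
    # K0bar amplitude: weak phases flipped, overall minus sign from the CP convention.
    amp_bar = -(c0 * a0 * cmath.exp(1j * (d0 - phi0)) + c2 * a2 * cmath.exp(1j * (d2 - phi2)))
    k_long = (1 + eps_bar) * amp + (1 - eps_bar) * amp_bar
    k_short = (1 + eps_bar) * amp - (1 - eps_bar) * amp_bar
    return k_long / k_short


def eta_first_order(eps_bar, a0, a2, d0, d2, phi0, phi2, channel):
    """First-order result: eps + eps' for '+-', eps - 2 eps' for '00'."""
    eps = eps_bar + 1j * phi0
    eps_prime = (1j / math.sqrt(2.0)) * cmath.exp(1j * (d2 - d0)) * (a2 / a0) * (phi2 - phi0)
    return eps + eps_prime if channel == "+-" else eps - 2.0 * eps_prime


if __name__ == "__main__":
    # Hypothetical inputs: |A2|/|A0| = 1/22, small weak phases, arbitrary strong phases.
    args = (1.6e-3 + 1.5e-3j, 1.0, 1.0 / 22.0, 0.7, -0.1, 1.0e-4, 5.0e-4)
    for ch in ("+-", "00"):
        print(ch, eta_exact(*args, ch), eta_first_order(*args, ch))
```

The residual difference between the two functions is second order in $|A_2|/|A_0|$, which is the size of the terms dropped in the derivation.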
The merit of this manoeuvering is that the parameter $\epsilon'$ involves only the CP
violation in the transition amplitude ('direct CP violation'), while $\epsilon$ involves
both a transition phase and the mixing parameter $\bar\epsilon$.

What can experiment tell us about $\epsilon$ and $\epsilon'$? Consider first $\delta_L$. Assuming
that $|A(K^0 \to \ell^+\nu_\ell\pi^-)| = |A(\bar{K}^0 \to \ell^-\bar{\nu}_\ell\pi^+)|$ and that $A(K^0 \to \ell^-\bar{\nu}_\ell\pi^+) =
A(\bar{K}^0 \to \ell^+\nu_\ell\pi^-) = 0$, we find

$$ \delta_L = 2\,{\rm Re}\,\bar\epsilon/(1 + |\bar\epsilon|^2) \approx 2\,{\rm Re}\,\epsilon, \tag{21.93} $$

so that $\delta_L$ is sensitive to the same parameter as appears in the $K_L \to \pi\pi$
decays. An interesting observable is the ratio between the ratios of the decay
rates to $\pi^+\pi^-$ and $\pi^0\pi^0$ of $K_S$ and $K_L$. One finds

$$ \frac{1}{6}\left(1 - \frac{|\eta_{00}|^2}{|\eta_{+-}|^2}\right) = {\rm Re}\,(\epsilon'/\epsilon), \tag{21.94} $$

which from equation (21.82) is another small number, approximately equal
to $1.64 \times 10^{-3}$. In the years before the B factories opened, $\epsilon'$ was the only
window into CP violation in the transition amplitude. But all the branching
ratios in (21.94) are of order $10^{-3}$, and establishing a non-zero value of $\epsilon'$ was
very difficult. The first claim for non-zero $\epsilon'$ was by the NA31 experiment at
CERN (Barr et al. 1993), a 3.5 standard deviation effect. But a contemporary
experiment at Fermilab (Gibbons et al. 1993) found a result compatible with
zero. The next generation of experiments produced agreement:

$$ {\rm Re}(\epsilon'/\epsilon) = (2.07 \pm 0.28) \times 10^{-3} \qquad \text{Alavi-Harati et al. 2003 (KTeV)} \tag{21.95} $$
$$ {\rm Re}(\epsilon'/\epsilon) = (1.47 \pm 0.22) \times 10^{-3} \qquad \text{Batley et al. 2002 (NA48)}. \tag{21.96} $$

The current world average is $(1.65 \pm 0.26) \times 10^{-3}$. Fits to all the data also
yield (Nakamura et al. 2010)

$$ |\epsilon| = (2.228 \pm 0.011) \times 10^{-3}. \tag{21.97} $$

The experimental value of $\delta_L$ gives us ${\rm Re}\,\epsilon = 1.66 \times 10^{-3}$, and we can deduce
that ${\rm arg}\,\epsilon \approx \pi/4$. The phase of $\epsilon'$ is $\pi/2 + \delta_2 - \delta_0$ which happens also to be
approximately $\pi/4$. It follows that $\epsilon'/\epsilon$ is very nearly real.
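
The arithmetic behind these statements is short enough to reproduce. The sketch below takes only the central values quoted in (21.82), (21.84) and (21.97), ignores their uncertainties, and uses illustrative variable names.

```python
import math

# Central values quoted in the text (uncertainties ignored in this sketch).
eta_00 = 2.221e-3
eta_pm = 2.232e-3
delta_l = 3.32e-3
abs_eps = 2.228e-3

# Re(eps'/eps) from the double ratio of |eta|^2 values.
re_eps_prime_over_eps = (1.0 - (eta_00 / eta_pm) ** 2) / 6.0

# Re(eps) from the semileptonic asymmetry, delta_L ~ 2 Re(eps).
re_eps = 0.5 * delta_l

# Phase of eps from Re(eps) and |eps|.
arg_eps_deg = math.degrees(math.acos(re_eps / abs_eps))

if __name__ == "__main__":
    print("Re(eps'/eps) ~", re_eps_prime_over_eps)
    print("Re(eps)      ~", re_eps)
    print("arg(eps)     ~", arg_eps_deg, "degrees")
```

The phase obtained this way is a little above 40°, close to the $\pi/4$ quoted above.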

Comparison of these small numbers with theoretical predictions is com- 
plicated by hadronic uncertainties, and it is beyond our scope to pursue that 
issue. 

In closing this discussion of mesonic mixing and CP violation, we briefly
discuss the charm sector. First, we note that $D^0$-$\bar{D}^0$ mixing has been observed
(Aubert et al. 2007c, Staric et al. 2007, Aaltonen et al. 2008). CP-violating
effects in charm decays have been generally expected to be very small. A
rough estimate of the direct CP-violating asymmetries in D decays can be
made following the method of section 21.1. Consider, for example, the decays
$D^0 \to K^+K^-$ and $\bar{D}^0 \to K^-K^+$. As in (21.5) and (21.10), the amplitude for
the first decay is

$$ A(D^0 \to K^+K^-) = V^*_{cs}V_{us}\,T_{KK} + V^*_{cb}V_{ub}\,P_{KK} \tag{21.98} $$
$$ = T\left(1 + r_K e^{i(\delta_K - \gamma)}\right), \tag{21.99} $$

where $r_K$ is the relative magnitude of the penguin contribution, and $\delta_K$ is
the relative strong phase. The amplitude for the CP-conjugate process is the
same, with $\gamma$ replaced by $-\gamma$. The penguin contribution is CKM-suppressed
by a factor $V^*_{cb}V_{ub}/V^*_{cs}V_{us} \sim \lambda^4$, and there is also a loop factor, so that $r_K$
would seem to be of order $10^{-4}$. The asymmetry is then

$$ A^{\rm dir}_{KK} = \frac{|A(D^0 \to K^+K^-)|^2 - |A(\bar{D}^0 \to K^-K^+)|^2}{|A(D^0 \to K^+K^-)|^2 + |A(\bar{D}^0 \to K^-K^+)|^2} \tag{21.100} $$
$$ = 2r_K\sin\gamma\,\sin\delta_K, \tag{21.101} $$

which is indeed very small. A similar argument predicts the asymmetry in
the decays $D^0 \to \pi^+\pi^-$ and $\bar{D}^0 \to \pi^-\pi^+$ to be

$$ A^{\rm dir}_{\pi\pi} = -2r_\pi\sin\gamma\,\sin\delta_\pi. \tag{21.102} $$
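
The small-$r$ approximation for the direct asymmetry can be compared with the exact ratio. The sketch below assumes an amplitude of the form $1 + r\,e^{i(\delta - \gamma)}$ for the decay and $1 + r\,e^{i(\delta + \gamma)}$ for its CP conjugate, with the common tree factor divided out; the values of $r$, $\gamma$ and $\delta$ used are purely illustrative.

```python
import cmath
import math


def direct_asymmetry(r, gamma, delta):
    """Exact (|A|^2 - |Abar|^2)/(|A|^2 + |Abar|^2) with the tree amplitude set to 1."""
    amp = 1.0 + r * cmath.exp(1j * (delta - gamma))
    amp_cp = 1.0 + r * cmath.exp(1j * (delta + gamma))
    num = abs(amp) ** 2 - abs(amp_cp) ** 2
    den = abs(amp) ** 2 + abs(amp_cp) ** 2
    return num / den


def leading_order(r, gamma, delta):
    """Leading term in r: 2 r sin(gamma) sin(delta)."""
    return 2.0 * r * math.sin(gamma) * math.sin(delta)


if __name__ == "__main__":
    r, gamma, delta = 1.0e-4, math.radians(70.0), 0.5  # illustrative values only
    print(direct_asymmetry(r, gamma, delta), leading_order(r, gamma, delta))
```

Even with the strong phase at its most favourable value, an $r$ of order $10^{-4}$ caps the asymmetry at a few times $10^{-4}$.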


Recently, however, the LHCb collaboration has published a measurement
of the difference between the time-integrated CP asymmetries in the $KK$ and
$\pi\pi$ decays, which to a very good approximation can be identified with the
difference between the direct asymmetries (21.101) and (21.102). The LHCb
result is (Aaij et al. 2012)

$$ A^{\rm dir}_{KK} - A^{\rm dir}_{\pi\pi} = (-0.82 \pm 0.21 \pm 0.11)\%, \tag{21.103} $$

which is substantially larger than the estimates (21.101) and (21.102).

It is possible that this 3.5σ effect (the first evidence for CP-violation in
the charm sector) indicates the presence of some new physics. However, it
must be noted that the mass scale of the charm quark, $m_c \sim 1.3$ GeV, is not
large enough to be safely in the perturbative QCD regime (as indicated by
the parameter $\Lambda_{\overline{\rm MS}}/m_c$), so that non-perturbative enhancements are
possible. CP-violation in the charm sector promises to be an interesting area for
experimental and theoretical exploration.


21.4 Neutrino mixing and oscillations 
21.4.1 Neutrino mass and mixing 


Experiments with solar, atmospheric, reactor and accelerator neutrinos have 
established the phenomenon of neutrino oscillations caused by non-zero
neutrino masses, and mixing. We shall give an elementary introduction to this
topic, which is a highly active field of research in particle physics; there are
analogies with the meson oscillations we have been considering. 

It is fair to say that in the original Standard Model the neutrinos were 
taken to be massless, but there was no compelling theoretical reason for this, 
and the framework of the Standard Model can easily be extended to include 
massive neutrinos. However, one question immediately arises: are neutrinos 
Dirac or Majorana fermions? As explained in section 20.3, we do not yet know 
the answer, and it may be some time before we do. The way the mass terms 
enter the Lagrangian is, in fact, different in the two cases. We are familiar 
with the Dirac mass term 


$$ m\bar{\hat\psi}\hat\psi = m\left(\bar{\hat\psi}_R\hat\psi_L + \bar{\hat\psi}_L\hat\psi_R\right), \tag{21.104} $$


where $\hat\psi$ is a four-component Dirac field, and R and L refer to the chirality
components. We learned in section 7.5.2 that a Majorana mass term can be 
written in the form 

miog + h.c. (21.105) 


where $\hat\chi_L$ is a two-component field of L chirality. A similar expression could
be written using a two-component R-chirality field. The difference in form 
between the Dirac and Majorana mass terms leads to a difference in the 
parametrization of neutrino mixing, as we shall see. 

Suppose, first, that the neutrinos are Dirac particles, with both L and R 
chiralities (or equivalently either helicity) for a given mass. We remind the 
reader that this is not ruled out experimentally, since the non-observation of 
the ‘wrong’ helicity component may be accounted for by the appearance of a 
suppression factor (m/E), where m is a neutrino mass and E is an average 
neutrino energy (see section 20.2.2). We also assume that their interactions 
have the V-A structure indicated by the phenomenology of the previous chap- 
ter. Then only the L (R) chirality component of a neutrino (antineutrino) field 
feels the weak force; the R (L) component of a neutrino (antineutrino) field 
has no interactions of Standard Model type. But, just as in the quark case, it 
will in general be necessary to allow for the possibility that the L-components 
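of the fields which have definite neutrino mass, call them $\hat\nu_{1L}$, $\hat\nu_{2L}$, $\hat\nu_{3L}$, are not
the same as the fields $\hat\nu_{eL}$, $\hat\nu_{\mu L}$, $\hat\nu_{\tau L}$ which enter into the charged current V-A
interaction. For Dirac neutrinos, we therefore write

$$ \begin{pmatrix} \hat\nu_e \\ \hat\nu_\mu \\ \hat\nu_\tau \end{pmatrix}_L = \begin{pmatrix} U_{e1} & U_{e2} & U_{e3} \\ U_{\mu 1} & U_{\mu 2} & U_{\mu 3} \\ U_{\tau 1} & U_{\tau 2} & U_{\tau 3} \end{pmatrix} \begin{pmatrix} \hat\nu_1 \\ \hat\nu_2 \\ \hat\nu_3 \end{pmatrix}_L \equiv U \begin{pmatrix} \hat\nu_1 \\ \hat\nu_2 \\ \hat\nu_3 \end{pmatrix}_L, \tag{21.106} $$

where the unitary matrix $U$ is the PMNS matrix, named after Pontecorvo
(1957, 1958, 1967), and Maki, Nakagawa and Sakata (1962).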

Now we showed in section 20.7.3 that the general 3 × 3 unitary matrix has
three real (rotation angle) parameters, and 6 phase parameters, five of which
we could get rid of by rephasing the quark fields by global U(1) transformations
of the form $\hat{q}' = \exp(i\theta)\hat{q}$. Such rephasing transformations are equally allowed
for the charged leptons, and also for Dirac neutrinos, since evidently the mass
term (21.104) is invariant under a global U(1) transformation $\hat\psi' = \exp(i\theta)\hat\psi$.
Hence the matrix $U$ will, in this Dirac case, have a parametrization of the
CKM form, with one CP-violating phase.
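
The counting argument generalizes to any number of generations. The sketch below, with an illustrative function name, counts the rotation angles and the irreducible phases of an $n \times n$ unitary mixing matrix in the Dirac case, where both sets of fields may be rephased.

```python
def dirac_mixing_parameters(n):
    """Return (angles, physical_phases) for an n x n unitary mixing matrix.

    Dirac case: all 2n fields can be rephased, but one overall phase
    leaves the matrix unchanged, so only 2n - 1 phases can be removed.
    """
    total_real = n * n                # real parameters of a unitary matrix
    angles = n * (n - 1) // 2         # parameters of a real rotation
    phases = total_real - angles      # n(n+1)/2 phases before rephasing
    removable = 2 * n - 1
    return angles, phases - removable


if __name__ == "__main__":
    for n in (2, 3, 4):
        print(n, dirac_mixing_parameters(n))
```

For three generations this reproduces the three angles and single phase of the CKM form, and for two generations no phase survives.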

The mixing described by (21.106) implies that the individual lepton flavour 
numbers $L_e$, $L_\mu$, $L_\tau$ are no longer conserved. However, since we are here taking
the neutrinos to be Dirac particles, there will be a quantum number carried by
$\nu_e$, $\nu_\mu$ and $\nu_\tau$ which is conserved by the interactions. This could, for example,
be the total lepton number $L_e + L_\mu + L_\tau$, assigning $L(\nu_\alpha) = 1$ for $\alpha = e, \mu, \tau$,
which would follow from invariance under the global U(1) transformation $\hat\ell'_\alpha = \exp(i\theta)\hat\ell_\alpha$,
$\hat\nu'_\alpha = \exp(i\theta)\hat\nu_\alpha$, where $\theta$ is independent of the flavour $\alpha$.

This ‘Dirac’ option, though simple, may be thought uneconomical, how- 
ever. As noted, the R components of neutrino fields have no interactions of 
Standard Model type. The charged leptons do have electromagnetic interac- 
tions, of course, as do the quarks, which also have strong interactions. But 
the neutral neutrinos only have weak interactions, which involve only their 
L-components. Why, then, enlarge the field content to include hypothetical
$\hat\nu_R$ fields, which don’t have any SM interactions? It seems more economical to
make do with only the $\hat\nu_L$ fields. In this case, the Dirac mass term (21.104) is
not possible, but a Majorana mass term (21.105) can still exist. Clearly, such 
a mass term is not invariant under U(1) global phase transformations, and it 
breaks lepton number conservation explicitly. As in the Dirac case, the chiral 
L component will include a ‘wrong’ (i.e. positive) helicity component with an 
amplitude proportional to m/E. 

The fact that global phase changes on the neutrino fields are now no longer 
freely available, because that symmetry is lost if they are Majorana fields, has 
implications for the mixing matrix, call it $U_M$, in this case. Since the three
Majorana neutrino fields can no longer absorb phases, we have only the three
phases from the charged leptons at our disposal, which leaves three phase
parameters in $U_M$, after rephasing. The PMNS matrix in the Majorana case
therefore has two more irreducible phase parameters than the CKM matrix,
and is conventionally parametrized as

$$U_M = U(\text{CKM-type}) \times \mathrm{diag}\left(1, e^{i\alpha_{21}/2}, e^{i\alpha_{31}/2}\right). \tag{21.107}$$


There are three CP-violating phases in the Majorana neutrino case. 
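
The phase counting above can be checked with a few lines of code. The following is only an illustrative sketch: the function name `mixing_parameters` is ours, and it simply encodes the counting rules quoted in the text (a general $n \times n$ unitary matrix, with $2n - 1$ removable phases in the Dirac case and $n$ in the Majorana case).

```python
def mixing_parameters(n, majorana=False):
    """Count the physical parameters of an n x n unitary mixing matrix.

    A general n x n unitary matrix has n(n-1)/2 rotation angles and
    n(n+1)/2 phases. Rephasing removes 2n-1 phases in the Dirac case
    (one overall phase has no effect), but only the n charged-lepton
    phases are available in the Majorana case.
    """
    angles = n * (n - 1) // 2
    phases = n * (n + 1) // 2
    removable = n if majorana else 2 * n - 1
    return angles, phases - removable


print(mixing_parameters(3))                 # Dirac (CKM-like) counting
print(mixing_parameters(3, majorana=True))  # Majorana counting
```

For three generations this reproduces one phase in the Dirac case and three in the Majorana case; for two generations the same rule gives no Dirac phase and one Majorana phase.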

The only information at present (2012) concerning the entries in U comes 
from neutrino oscillation experiments, which we shall discuss in the next sections.
We shall see that the Majorana phases $\alpha_{21}$ and $\alpha_{31}$ cancel in the
probabilities calculated for neutrino transitions, and no experiment so far is
sensitive to CP-violating effects in the neutrino sector. We shall discuss how
the values of the parameters $\theta_{12}$, $\theta_{13}$ and $\theta_{23}$ can be inferred from the observed
oscillations, and also the differences in the squared masses of the neutrinos.
Anticipating these results, we state here that the two independent squared
mass differences, $m_2^2 - m_1^2$ and $m_3^2 - m_1^2$, turn out to be very small indeed, and
rather different from each other: namely approximately $7.6 \times 10^{-5}$ eV$^2$ and
$2.4 \times 10^{-3}$ eV$^2$, respectively. The smaller value is associated with oscillations
of solar or reactor neutrinos, and the larger with oscillations of atmospheric 
or accelerator neutrinos. 

Data on the actual mass values are limited. There is a bound on the 
$\bar\nu_e$ mass from measurements of the electron spectrum near the end-point in
tritium $\beta$-decay, which gives (Lobashev et al. 2003, Eitel et al. 2005)

$$m_{\bar\nu_e} < 2.3\ \mathrm{eV} \quad 95\%\ \mathrm{CL}. \tag{21.108}$$

A weaker limit on $m_{\nu_\mu}$ comes from measurements of the muon spectrum in
charged pion decay:

$$m_{\nu_\mu} < 0.19\ \mathrm{MeV} \quad 95\%\ \mathrm{CL}. \tag{21.109}$$

The strongest upper bound comes from cosmology, assuming three neutrinos.
The Cosmic Microwave Background data of the WMAP experiment, combined
with supernovae data and data on galaxy clustering, can be used to obtain an
upper limit on the sum of three neutrino masses (Spergel et al. 2007):

$$\sum_{i=1}^{3} m_{\nu_i} < 0.68\ \mathrm{eV}, \quad 95\%\ \mathrm{CL}. \tag{21.110}$$


Taking the squared mass differences as indicative of the actual mass scale, 
neutrino masses are evidently very much smaller than the masses of the other 
fermions in the Standard Model. We shall return to what this might tell 
us about the origin of neutrino mass in section 22.5, where we discuss how 
gauge-invariant masses are generated in the Standard Model. 
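
To make the mass scale concrete, the short sketch below turns the two squared mass differences quoted above into masses. It rests on two assumptions that the data quoted so far do not fix: that the ordering is $m_1 < m_2 < m_3$, and that the lightest mass is a free input (set to zero here). The function name is illustrative only.

```python
import math

DM2_21 = 7.6e-5  # eV^2, smaller squared mass difference quoted in the text
DM2_31 = 2.4e-3  # eV^2, larger squared mass difference (magnitude)


def masses_normal(m_lightest=0.0):
    """Masses in eV assuming the ordering m1 < m2 < m3 (an assumption)."""
    m1 = m_lightest
    m2 = math.sqrt(m1 ** 2 + DM2_21)
    m3 = math.sqrt(m1 ** 2 + DM2_31)
    return m1, m2, m3


m1, m2, m3 = masses_normal(0.0)
print(f"m2 = {m2:.4f} eV, m3 = {m3:.4f} eV, sum = {m1 + m2 + m3:.4f} eV")
```

Under these assumptions the heaviest neutrino has a mass of about 0.05 eV, and the sum of the masses lies more than a factor of ten below the cosmological bound (21.110).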

Returning to the question of CP violation, we noted in section 4.2.3 that 
the CP violation present in the Standard Model was insufficient to account for 
the matter-antimatter asymmetry in the universe. However, we now see that 
it is possible to have CP violation in the lepton sector, in an extended Stan- 
dard Model with massive neutrinos. Leptonic matter—antimatter asymmetries 
can be converted into baryon asymmetries in the very hot early universe by a 
non-perturbative process predicted by Standard Model dynamics — a process 
called leptogenesis (Fukugita and Yanagida 1986, Kuzmin, Rubakov and Sha- 
poshnikov 1985). It has been argued that the Dirac and/or Majorana phases 
in the neutrino matrix U or Uy can provide the CP violation necessary in 
leptogenesis models for the generation of the observed baryon asymmetry of 
the universe (Pascoli et al. 2007a, 2007b). If such a proposal should prove 
to be the case, the reach of Pauli’s ‘desperate remedy’ will have been vast 
indeed. 


21.4.2 Neutrino oscillations: formulae 


The existence of neutrino oscillations means that if a neutrino of a given
flavour $\nu_\alpha$ ($\alpha = e, \mu, \tau$) with energy $E$ is produced in a charged current weak
interaction process, such as $\pi^+ \to \mu^+ \nu_\mu$, then at a sufficiently large distance $L$
from the $\nu_\alpha$ source the probability $P(\nu_\alpha \to \nu_\beta; E, L)$ of detecting a neutrino of
a different flavour $\nu_\beta$ is non-zero.$^2$ Such a flavour change will of course imply
that the $\nu_\alpha$ survival probability, $P(\nu_\alpha \to \nu_\alpha; E, L)$, is less than 1. We shall
give a simplified version of the derivation of such probabilities, following the 
approach of the review by Nakamura and Petcov in section 13 of Nakamura 
et al. (2010). This review includes a large list of references to the time- 
dependent formalism; we mention here the contributions of Kayser (1981), 
Nauenberg (1999) and Cohen et al. (2009). We shall treat all the neutrinos 
as stable particles. 

We consider the evolution of the state $|\nu_\alpha\rangle$ in the frame in which the
detector which measures its flavour is at rest (the lab frame). As in the meson
case, the states with simple space-time evolution in a vacuum are the mass
eigenstates $|\nu_i\rangle$ ($i = 1, 2, 3$), a superposition of which is equal to $|\nu_\alpha\rangle$:

$$|\nu_\alpha\rangle = \sum_i U^*_{\alpha i} |\nu_i; p_i\rangle, \tag{21.111}$$

the complex conjugation arising from taking the dagger of the relation (21.106)
for the field operators. Here $U$ stands for either the Dirac or the Majorana
matrix, and $p_i$ is the 4-momentum of $\nu_i$. Similarly,

$$|\bar\nu_\alpha\rangle = \sum_i U_{\alpha i} |\bar\nu_i; p_i\rangle. \tag{21.112}$$


We will consider highly relativistic neutrinos, as is the case for the experiments 
under discussion. We will assume that there are no degeneracies among the 
masses $m_i$. The states in the superpositions (21.111) and (21.112) will all
have, in general, different energies and momenta $E_i, p_i$. We shall also treat
the evolution as occurring in one spatial dimension, taking all the momenta
to lie in the direction from the source to the detector. Note that the fractional
deviation of $E_i$ and $p_i$ from the massless case $E = p$ is of order $m_i^2/E^2$ which
will be extremely small, of order one part in $10^{16}$, say.

Suppose now that the neutrinos of flavour $\nu_\alpha$ started in the state (21.111)
at time $t = 0$ in the detector frame are detected at time $T$ after production,
having travelled a distance $L$. Then the amplitude for finding a neutrino of
flavour $\nu_\beta$ at $(L, T)$ is

$$A(\nu_\alpha \to \nu_\beta; L, T) = \sum_i U^*_{\alpha i}\, e^{-iE_iT + ip_iL}\, \langle\nu_\beta|\nu_i; p_i\rangle = \sum_i U^*_{\alpha i} U_{\beta i}\, e^{-iE_iT + ip_iL}. \tag{21.113}$$

$^2$We shall not indicate the chirality explicitly from now on, it being assumed that we are
referring to the L (R) component for neutrinos (antineutrinos).

We make two immediate comments on (21.113). First, the Majorana phases
in (21.107) cancel in $A(\nu_\alpha \to \nu_\beta; L, T)$, since the same phase appears
in $U^*_{\alpha i}$ and $U_{\beta i}$. We conclude that oscillation experiments cannot distinguish
Majorana from Dirac neutrinos. Second, if the neutrinos were massless, the
phase factors in (21.113) would all be unity, and then $A(\nu_\alpha \to \nu_\beta; L, T) = \delta_{\alpha\beta}$,
from the unitarity of the matrix $U$, so there would be no flavour change.
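
Both comments can be illustrated numerically. The sketch below is not taken from the text: it assumes the standard explicit CKM-type parametrization for $U$, replaces each phase in (21.113) by its relativistic limit $m_i^2 L/2E$ (an overall common phase being dropped), and uses arbitrary illustrative values for the angles and phases. The names `mixing_matrix` and `amplitude` are ours.

```python
import cmath
import math

HBARC_EV_KM = 1.973269804e-10  # hbar*c in eV km, converts L/E to natural units


def mixing_matrix(t12, t13, t23, delta=0.0, a21=0.0, a31=0.0):
    """CKM-type matrix times diag(1, e^{i a21/2}, e^{i a31/2}), cf. (21.107).

    Rows are e, mu, tau; columns are the mass states 1, 2, 3. The explicit
    CKM-type form used below is the standard one and is an assumption here.
    """
    s12, c12 = math.sin(t12), math.cos(t12)
    s13, c13 = math.sin(t13), math.cos(t13)
    s23, c23 = math.sin(t23), math.cos(t23)
    ed = cmath.exp(1j * delta)
    ckm = [
        [c12 * c13, s12 * c13, s13 / ed],
        [-s12 * c23 - c12 * s23 * s13 * ed, c12 * c23 - s12 * s23 * s13 * ed, s23 * c13],
        [s12 * s23 - c12 * c23 * s13 * ed, -c12 * s23 - s12 * c23 * s13 * ed, c23 * c13],
    ]
    majorana = [1.0, cmath.exp(0.5j * a21), cmath.exp(0.5j * a31)]
    return [[complex(ckm[r][i] * majorana[i]) for i in range(3)] for r in range(3)]


def amplitude(u, a, b, m2, L_km, E_GeV):
    """Sum over mass states of U*_{a i} U_{b i} exp(-i m_i^2 L / 2E)."""
    total = 0j
    for i in range(3):
        phase = m2[i] * L_km / (2.0 * E_GeV * 1e9 * HBARC_EV_KM)
        total += u[a][i].conjugate() * u[b][i] * cmath.exp(-1j * phase)
    return total


angles = dict(t12=0.59, t13=0.15, t23=math.pi / 4, delta=1.0)  # illustrative values
m2 = [0.0, 7.6e-5, 2.4e-3]  # eV^2, lightest mass set to zero for illustration
u_dirac = mixing_matrix(**angles)
u_majorana = mixing_matrix(a21=0.7, a31=2.1, **angles)

# Index 0, 1, 2 = e, mu, tau: here nu_mu -> nu_e at L = 295 km, E = 0.6 GeV.
p_dirac = abs(amplitude(u_dirac, 1, 0, m2, 295.0, 0.6)) ** 2
p_majorana = abs(amplitude(u_majorana, 1, 0, m2, 295.0, 0.6)) ** 2
a_massless = amplitude(u_dirac, 1, 0, [0.0, 0.0, 0.0], 295.0, 0.6)
print(p_dirac, p_majorana, abs(a_massless))
```

The two probabilities agree to rounding error whatever Majorana phases are chosen, and the flavour-changing amplitude vanishes when all masses are set to zero.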

Flavour oscillations come about via the interference in $|A(\nu_\alpha \to \nu_\beta; L, T)|^2$
between phase factors that are slightly different from one another, because of
the different masses. A typical interference phase is then
$\phi_{ij} = (E_i - E_j)T - (p_i - p_j)L$. Following the review by Nakamura and Petcov in Nakamura et al.
(2010), we note that


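$$\frac{m_i^2 - m_j^2}{p_i + p_j} = \frac{(E_i^2 - p_i^2) - (E_j^2 - p_j^2)}{p_i + p_j} = (E_i - E_j)\frac{E_i + E_j}{p_i + p_j} - (p_i - p_j), \tag{21.114}$$

so that

$$\phi_{ij} = (E_i - E_j)\left[T - \frac{E_i + E_j}{p_i + p_j} L\right] + \frac{m_i^2 - m_j^2}{p_i + p_j} L. \tag{21.115}$$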


Bearing in mind that the energies differ from the momenta by terms of order
$m_i^2/E^2$, we see that the first term in (21.115) can be dropped, and the
interference phase is, to a very good approximation,

$$\phi_{ij} \approx \frac{m_i^2 - m_j^2}{2E} L \equiv \frac{\Delta m^2_{ij}}{2E} L, \tag{21.116}$$


where E is the average energy, or momentum, of the neutrinos. We therefore 
obtain the probability 


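$$P(\nu_\alpha \to \nu_\beta; L, E) = \sum_i |U_{\alpha i}|^2 |U_{\beta i}|^2 + 2\sum_{i>j} |U_{\beta i} U^*_{\alpha i} U_{\alpha j} U^*_{\beta j}| \cos\left(\frac{\Delta m^2_{ij}}{2E} L - \phi_{\alpha\beta;ij}\right), \tag{21.117}$$

where

$$\phi_{\alpha\beta;ij} = \mathrm{Arg}\left(U_{\beta i} U^*_{\alpha i} U_{\alpha j} U^*_{\beta j}\right). \tag{21.118}$$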


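A more useful expression can be obtained by using the unitarity of $U$ (problem
21.6):

$$P(\nu_\alpha \to \nu_\beta; L, E) = \delta_{\alpha\beta} - 4\sum_{i>j} \mathrm{Re}\left(U_{\beta i} U^*_{\alpha i} U_{\alpha j} U^*_{\beta j}\right) \sin^2\frac{\Delta m^2_{ij} L}{4E} + 2\sum_{i>j} \mathrm{Im}\left(U_{\beta i} U^*_{\alpha i} U_{\alpha j} U^*_{\beta j}\right) \sin\frac{\Delta m^2_{ij} L}{2E}. \tag{21.119}$$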


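The expression for $P(\bar\nu_\alpha \to \bar\nu_\beta; L, E)$ is the same, except for a change in sign
of the last term in (21.119):

$$P(\bar\nu_\alpha \to \bar\nu_\beta; L, E) = \delta_{\alpha\beta} - 4\sum_{i>j} \mathrm{Re}\left(U_{\beta i} U^*_{\alpha i} U_{\alpha j} U^*_{\beta j}\right) \sin^2\frac{\Delta m^2_{ij} L}{4E} - 2\sum_{i>j} \mathrm{Im}\left(U_{\beta i} U^*_{\alpha i} U_{\alpha j} U^*_{\beta j}\right) \sin\frac{\Delta m^2_{ij} L}{2E}. \tag{21.120}$$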
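
As a numerical cross-check of (21.119) and (21.120), one can compare them with the squared modulus of the amplitude (21.113) evaluated directly. The sketch below assumes the standard CKM-type parametrization with arbitrary illustrative angles, works in natural units with $x = L/E$ in eV$^{-2}$, and takes the antineutrino amplitude to be obtained by replacing $U$ with $U^*$, as (21.112) suggests. All function names are ours.

```python
import cmath
import math


def mixing_matrix(t12, t13, t23, delta):
    """Standard CKM-type parametrization (an assumption); rows e, mu, tau."""
    s12, c12 = math.sin(t12), math.cos(t12)
    s13, c13 = math.sin(t13), math.cos(t13)
    s23, c23 = math.sin(t23), math.cos(t23)
    ed = cmath.exp(1j * delta)
    rows = [
        [c12 * c13, s12 * c13, s13 / ed],
        [-s12 * c23 - c12 * s23 * s13 * ed, c12 * c23 - s12 * s23 * s13 * ed, s23 * c13],
        [s12 * s23 - c12 * c23 * s13 * ed, -c12 * s23 - s12 * c23 * s13 * ed, c23 * c13],
    ]
    return [[complex(z) for z in row] for row in rows]


def quartic(u, a, b, i, j):
    """U_{b i} U*_{a i} U_{a j} U*_{b j}, the combination appearing in (21.119)."""
    return u[b][i] * u[a][i].conjugate() * u[a][j] * u[b][j].conjugate()


def prob_formula(u, a, b, m2, x, anti=False):
    """(21.119) for neutrinos, (21.120) for antineutrinos; x = L/E in eV^-2."""
    p = 1.0 if a == b else 0.0
    sign = -1.0 if anti else 1.0
    for i in range(3):
        for j in range(i):  # all pairs with i > j
            dm2 = m2[i] - m2[j]
            w = quartic(u, a, b, i, j)
            p -= 4.0 * w.real * math.sin(dm2 * x / 4.0) ** 2
            p += sign * 2.0 * w.imag * math.sin(dm2 * x / 2.0)
    return p


def prob_direct(u, a, b, m2, x, anti=False):
    """|A|^2 from (21.113) with phases m_i^2 x / 2; U -> U* for antineutrinos."""
    amp = 0j
    for i in range(3):
        coeff = u[a][i].conjugate() * u[b][i]
        if anti:
            coeff = coeff.conjugate()
        amp += coeff * cmath.exp(-0.5j * m2[i] * x)
    return abs(amp) ** 2


u = mixing_matrix(0.59, 0.15, 0.785, 1.0)  # illustrative angles and phase
m2 = [0.0, 7.6e-5, 2.4e-3]                 # eV^2
x = 2000.0                                 # L/E in eV^-2, arbitrary
worst = max(
    abs(prob_formula(u, a, b, m2, x, anti) - prob_direct(u, a, b, m2, x, anti))
    for a in range(3) for b in range(3) for anti in (False, True)
)
print("largest mismatch:", worst)
```

The mismatch is at the level of rounding error for every pair of flavours, for neutrinos and antineutrinos alike.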


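It follows from (21.119) and (21.120) that $P(\nu_\alpha \to \nu_\beta; L, E) = P(\bar\nu_\beta \to \bar\nu_\alpha; L, E)$,
a consequence of CPT invariance. CP alone requires
$P(\nu_\alpha \to \nu_\beta; L, E) = P(\bar\nu_\alpha \to \bar\nu_\beta; L, E)$. A measure of CP violation is provided by

$$A^{(\beta\alpha)}_{CP} = P(\nu_\alpha \to \nu_\beta; L, E) - P(\bar\nu_\alpha \to \bar\nu_\beta; L, E) = 4\sum_{i>j} \mathrm{Im}\left(U_{\beta i} U^*_{\alpha i} U_{\alpha j} U^*_{\beta j}\right) \sin\frac{\Delta m^2_{ij} L}{2E}. \tag{21.121}$$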


The reader will recognize the Jarlskog (1985) invariants in (21.121). In this
$3 \times 3$ mixing situation, which is exactly analogous to quark mixing, all these
invariants are equal up to a sign, and (21.121) becomes (Krastev and Petcov
1988)

$$A^{(\mu e)}_{CP} = -A^{(\tau e)}_{CP} = A^{(\tau\mu)}_{CP} = 4J\left[\sin\frac{\Delta m^2_{32} L}{2E} + \sin\frac{\Delta m^2_{21} L}{2E} + \sin\frac{\Delta m^2_{13} L}{2E}\right], \tag{21.122}$$

where

$$J = \mathrm{Im}\left(U_{\mu 3} U^*_{e3} U_{e2} U^*_{\mu 2}\right). \tag{21.123}$$

If any one mass-squared difference is zero, say $\Delta m^2_{21}$, then $\Delta m^2_{32} = -\Delta m^2_{13}$,
and the right-hand side of (21.122) vanishes: we need all three mass-squared
differences to be non-zero, in order to get CP violation.
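
The statements about the Jarlskog invariant can also be checked numerically. As before, this is a sketch under stated assumptions: the standard CKM-type parametrization with illustrative angles, natural units with $x = L/E$ in eV$^{-2}$, and function names of our own choosing.

```python
import cmath
import math


def mixing_matrix(t12, t13, t23, delta):
    """Standard CKM-type parametrization (an assumption); rows e, mu, tau."""
    s12, c12 = math.sin(t12), math.cos(t12)
    s13, c13 = math.sin(t13), math.cos(t13)
    s23, c23 = math.sin(t23), math.cos(t23)
    ed = cmath.exp(1j * delta)
    rows = [
        [c12 * c13, s12 * c13, s13 / ed],
        [-s12 * c23 - c12 * s23 * s13 * ed, c12 * c23 - s12 * s23 * s13 * ed, s23 * c13],
        [s12 * s23 - c12 * c23 * s13 * ed, -c12 * s23 - s12 * c23 * s13 * ed, c23 * c13],
    ]
    return [[complex(z) for z in row] for row in rows]


def cp_asymmetry(u, a, b, m2, x):
    """Right-hand side of (21.121) for nu_a -> nu_b; x = L/E in eV^-2."""
    total = 0.0
    for i in range(3):
        for j in range(i):  # all pairs with i > j
            w = u[b][i] * u[a][i].conjugate() * u[a][j] * u[b][j].conjugate()
            total += 4.0 * w.imag * math.sin((m2[i] - m2[j]) * x / 2.0)
    return total


def jarlskog(u):
    """J = Im(U_mu3 U*_e3 U_e2 U*_mu2), as in (21.123)."""
    return (u[1][2] * u[0][2].conjugate() * u[0][1] * u[1][1].conjugate()).imag


def cp_asymmetry_jarlskog(u, m2, x):
    """Right-hand side of (21.122); mass states 1, 2, 3 are indices 0, 1, 2."""
    def s(i, j):
        return math.sin((m2[i] - m2[j]) * x / 2.0)
    return 4.0 * jarlskog(u) * (s(2, 1) + s(1, 0) + s(0, 2))


E, MU, TAU = 0, 1, 2
u = mixing_matrix(0.59, 0.15, 0.785, 1.0)  # illustrative angles and phase
m2 = [0.0, 7.6e-5, 2.4e-3]                 # eV^2
x = 2000.0                                 # L/E in eV^-2, arbitrary

print(cp_asymmetry(u, E, MU, m2, x), cp_asymmetry_jarlskog(u, m2, x))
# With m1 = m2 (one vanishing mass-squared difference) the asymmetry disappears:
print(cp_asymmetry(u, E, MU, [0.0, 0.0, 2.4e-3], x))
```

The first line prints the same number twice, confirming that the three terms of (21.121) are governed by a single invariant; the second prints zero to rounding error.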

In proceeding to discuss the experimental situation, it will be useful to
define an ‘oscillation length’ $\lambda_{ij}(E)$ given by

$$\lambda_{ij}(E) = 2E/\Delta m^2_{ij} \approx 0.4\, \frac{(E/\mathrm{GeV})}{(\Delta m^2_{ij}/\mathrm{eV}^2)}\ \mathrm{km}. \tag{21.124}$$
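
The numerical coefficient in (21.124) comes from restoring the factor $\hbar c$. A minimal sketch of the conversion follows; the function name is ours, and the value of $\hbar c$ is the standard one rather than a number taken from the text.

```python
HBARC_EV_M = 1.973269804e-7  # hbar*c in eV m


def oscillation_length_km(E_GeV, dm2_eV2):
    """lambda(E) = 2E / dm2 of (21.124), converted from natural units to km."""
    length_m = 2.0 * (E_GeV * 1e9) * HBARC_EV_M / dm2_eV2
    return length_m / 1e3


print(oscillation_length_km(1.0, 1.0))          # the coefficient in (21.124), in km
print(2 * oscillation_length_km(3e-3, 7.6e-5))  # 2*lambda for E = 3 MeV, dm2 = 7.6e-5 eV^2
```

The first number is close to 0.4, and the second is about 31 km, the scale relevant for reactor neutrinos of a few MeV and the smaller squared mass difference.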


In practice, the three-state mixing formalism can often be simplified, making 
use of what is now known about the neutrino mass spectrum. One squared 
mass difference is considerably smaller than the other: 


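$$|\Delta m^2_{21}| \approx 7.6 \times 10^{-5}\ \mathrm{eV}^2, \qquad |\Delta m^2_{31}| \approx 2.4 \times 10^{-3}\ \mathrm{eV}^2. \tag{21.125}$$

Suppose now that $L/|\lambda_{31}(E)| \sim 1$, while $L/|\lambda_{21}(E)| \ll 1$. Then expression
(21.119) reduces to (problem 21.7)

$$P(\nu_\alpha \to \nu_\beta; L, E) \approx \delta_{\alpha\beta} - 4|U_{\alpha 3}|^2 \left[\delta_{\alpha\beta} - |U_{\beta 3}|^2\right] \sin^2\left(\frac{\Delta m^2_{31}}{4E} L\right) = P(\bar\nu_\alpha \to \bar\nu_\beta; L, E). \tag{21.126}$$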


In particular,

$$P(\bar\nu_e \to \bar\nu_e; L, E) = 1 - 4|U_{e3}|^2 \left(1 - |U_{e3}|^2\right) \sin^2\left(\frac{\Delta m^2_{31}}{4E} L\right), \tag{21.127}$$

which can describe the survival probability of reactor $\bar\nu_e$s, for example.

Adopting a parametrization of the form (20.166), with rows labelled by
$e$, $\mu$ and $\tau$, and columns by 1, 2, and 3, $|U_{e3}|^2$ is $\sin^2\theta_{13}$, which is found
experimentally to be small (see the following section). It is often a good
approximation to set $|U_{e3}|$ to zero, in which case $|U_{\mu 3}|^2 = \sin^2\theta_{23}$. Then
(21.126) gives the $\nu_\mu$ survival probability

$$P(\nu_\mu \to \nu_\mu; L, E) = P(\bar\nu_\mu \to \bar\nu_\mu; L, E) \approx 1 - \sin^2 2\theta_{23}\, \sin^2(L/2\lambda_{31}(E)) \tag{21.128}$$

and the flavour-change probability

$$P(\nu_\mu \to \nu_\tau; L, E) = P(\bar\nu_\mu \to \bar\nu_\tau; L, E) \approx \sin^2 2\theta_{23}\, \sin^2(L/2\lambda_{31}(E)). \tag{21.129}$$

In this approximation, $P(\nu_\mu \to \nu_e) = P(\bar\nu_\mu \to \bar\nu_e) = 0$. Formulae (21.128)
and (21.129) can be used to describe the dominant atmospheric $\nu_\mu$ and $\bar\nu_\mu$
oscillations (see the following section), and the parameters $\theta_{23}$ and $\Delta m^2_{31}$ (or
$\Delta m^2_{32}$) are referred to as the atmospheric mixing angle and mass squared
difference. The smaller mass squared difference $\Delta m^2_{21}$, and the angle $\theta_{12}$, are
associated with solar $\nu_e$ oscillations.

The formulae (21.128) and (21.129) are, in fact, exactly what a simple
2-state mixing model would give. Suppose that the effective mixing matrix
for the 2-state system has the form (see problem 1.6)

$$\begin{pmatrix} -a\cos 2\theta & a\sin 2\theta \\ a\sin 2\theta & a\cos 2\theta \end{pmatrix} \tag{21.130}$$

where rows are labelled by $e$, $\mu$ and columns by 1, 3; then the survival probability
is just

$$1 - \sin^2 2\theta\, \sin^2(La), \tag{21.131}$$

where we have taken $L = T$ as before. We can therefore identify the mixing
parameter as

$$a = [2\lambda_{31}(E)]^{-1} = \frac{\Delta m^2_{31}}{4E}. \tag{21.132}$$
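
For completeness, here is a sketch of how (21.131) follows from (21.130), assuming that the matrix acts as an effective Hamiltonian $H$ on the flavour states. The labels $|\ell\rangle$ and $|h\rangle$ for the eigenvectors with eigenvalues $-a$ and $+a$ are introduced only for this sketch, and $|\nu_e\rangle$ denotes the flavour state labelling the first row.

```latex
% H is the matrix (21.130). Since H^2 = a^2, its eigenvalues are -a and +a:
H|\ell\rangle = -a\,|\ell\rangle, \qquad H|h\rangle = +a\,|h\rangle, \qquad
|\nu_e\rangle = \cos\theta\,|\ell\rangle + \sin\theta\,|h\rangle .
% Evolve for a time T and project back onto the initial flavour:
\langle\nu_e|\,e^{-iHT}\,|\nu_e\rangle
  = \cos^2\theta\, e^{iaT} + \sin^2\theta\, e^{-iaT}
  = \cos aT + i\cos 2\theta\,\sin aT .
% The survival probability is the squared modulus:
\left|\langle\nu_e|\,e^{-iHT}\,|\nu_e\rangle\right|^2
  = \cos^2 aT + \cos^2 2\theta\,\sin^2 aT
  = 1 - \sin^2 2\theta\,\sin^2 aT ,
% which is (21.131) once T is set equal to L.
```

The decomposition of $|\nu_e\rangle$ used in the first line can be verified by acting with (21.130) on the column vector $(\cos\theta, -\sin\theta)$, which is returned multiplied by $-a$.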
Note that the energies are here measured relative to a common average energy;
if $|\ell\rangle$ is the lighter eigenstate and $|h\rangle$ the heavier, then

$$|\nu_e\rangle = \cos\theta\,|\ell\rangle + \sin\theta\,|h\rangle, \qquad |\nu_\mu\rangle = -\sin\theta\,|\ell\rangle + \cos\theta\,|h\rangle. \tag{21.133}$$


21.4.3 Neutrino oscillations: experimental results


Historically, the search for neutrino oscillations began when experiments by 
Davis et al. (1968) detected solar neutrinos (from $^8$B decays) at a rate approximately
one third of that predicted by the solar model calculations of Bahcall
et al. (1968). Pontecorvo (1946) had proposed the experiment, in which the
neutrinos are detected by the inverse $\beta$-decay process $\nu_e + {}^{37}\mathrm{Cl} \to e^- + {}^{37}\mathrm{Ar}$.
The Davis experiment used 520 metric tons of liquid tetrachloroethylene
(C$_2$Cl$_4$), buried 4850 feet underground in the Homestake gold mine, in South
Dakota. Davis’ findings provided the impetus to study solar neutrinos using
Kamiokande, a 3000 ton imaging water Cerenkov detector situated about
one kilometre underground in the Kamioka mine in Japan. Indeed, $^8$B solar
neutrinos were observed, and at a rate consistent with that of the Davis exper- 
iment (Hirata et al. 1989). Later results from the Homestake mine (Cleveland 
et al. 1998) reported a solar neutrino detection rate almost exactly one third 
of the updated calculations of Bahcall et al. (2001). 

In a separate development, Kamiokande also reported (Hirata et al. 1988) 
an anomaly in the atmospheric neutrino flux. Atmospheric neutrinos are 
produced as decay products in hadronic showers which result from collisions 
of cosmic rays with nuclei in the upper atmosphere of the Earth. Production 
of electron and muon neutrinos is dominated by the decay chain $\pi^+ \to \mu^+ + \nu_\mu$,
$\mu^+ \to e^+ + \bar\nu_\mu + \nu_e$ (and its charge-conjugate), which gives an expected
value of about 2 for the ratio of $(\nu_\mu + \bar\nu_\mu)$ flux to $(\nu_e + \bar\nu_e)$ flux.$^3$ While the
number of electron-like events was in good agreement with the Monte Carlo
calculations based on atmospheric neutrino interactions in the detector, the
number of muon-like events was about one half of the expected number, at
the $4\sigma$ level.

This muon-like defect (and the lack of an electron-like defect) was later
confirmed at the $9\sigma$ level by Super-Kamiokande (Fukuda et al. 1998). In this
experiment, a marked dependence was observed on the zenith angle of the
muon neutrinos. This angle is simply related to the distance travelled by the
neutrinos from their point of production, which varies from about 20 km (from
above the detector) to over 10,000 km (from below the detector). The Super-Kamiokande
data were the first compelling evidence for neutrino oscillations.
Interpreting their data in terms of a simple 2-state $\nu_\mu \leftrightarrow \nu_\tau$ model, as in
(21.129), Fukuda et al. (1998) reported the values $\sin^2 2\theta_{23} > 0.82$, and
$5 \times 10^{-4} < \Delta m^2_{31} < 6 \times 10^{-3}$ eV$^2$ at 90% CL.

We will postpone further discussion of the solar neutrino deficit for the 
moment, since it is complicated by interactions of the neutrinos with the 
Sun’s matter (see the following subsection). We proceed to describe some of 
the main results which have come from the analysis of data from neutrinos 
produced in terrestrial accelerators and reactors. 


$^3$The detector could not measure the charge of the final state leptons, and therefore $\nu$
and $\bar\nu$ events could not be discriminated.

We begin with the CHOOZ experiment, which was the first experiment to
limit the value of $\theta_{13}$ (Apollonio et al. 1999, 2003). CHOOZ is the name of
a nuclear power station situated near the French village of the same name.
The experiment was designed to detect reactor $\bar\nu_e$s via the inverse $\beta$-decay
reaction $\bar\nu_e + p \to e^+ + n$. The signature was a delayed coincidence between
the prompt $e^+$ signal, and the signal from the neutron capture. The detector
was located in an underground laboratory about 1 km from the neutrino
source. It consisted of a central 5-ton target filled with 0.09% Gd-doped
liquid scintillator; Gd-doping was chosen to maximize the capture of the neutrons.
The neutrino energy $E$ was a few MeV, and $L$ was 1 km. For these
values $2\lambda_{21}(E)$ is greater than about 10 km, while $2\lambda_{31}(E)$ is about 0.3 km.
The neglect of $\sin^2(L/2\lambda_{21}(E))$ is justified, and formula (21.127) can be used
for the $\bar\nu_e$ survival probability. The experiment found no evidence for $\bar\nu_e$ disappearance,
and reported the 90% CL upper limit of $\sin^2 2\theta_{13} < 0.19$, for
$|\Delta m^2_{31}| = 2 \times 10^{-3}$ eV$^2$. We shall for the moment set $\theta_{13}$ to zero, and return
to discuss its value at the end of the chapter.

The mass squared range $\Delta m^2 > 2 \times 10^{-3}$ eV$^2$ can be explored by accelerator-based
long-baseline experiments, with typically $E \sim 1$ GeV and $L \sim$ several
hundred kilometres. The K2K (KEK-to-Kamioka) experiment was the
first accelerator-based experiment with a neutrino path length extending hundreds
of kilometres. A horn-focused wide-band $\nu_\mu$ beam with mean energy
1.3 GeV and path length 250 km was produced by 12 GeV protons from
the KEK-PS and directed to the Super-Kamiokande detector. In this case,
$L/2\lambda_{21}(E) \sim 10^{-2}$, which may be neglected. Then formulae (21.128) and
(21.129) may be used, in the approximation $U_{e3} = 0$. The K2K data showed
(Ahn et al. 2006) that $\sin^2 2\theta_{23} \approx 1$ ($\theta_{23} = \pi/4$), and that $|\Delta m^2_{31}|$ had a value
consistent with (21.125).

The first evidence for the appearance of $\nu_e$ in a $\nu_\mu$ beam was obtained by
the T2K collaboration (Abe et al. 2011). The $\nu_\mu$ beam is produced using
the high intensity proton accelerator at J-PARC, located in Tokai, Japan.
The beam was directed 2.5° off-axis to the Super-Kamiokande detector at
Kamioka, 295 km away. This configuration produces a narrow-band $\nu_\mu$ beam,
tuned at the first oscillation maximum $E_\nu = |\Delta m^2_{31}| L/2\pi \approx 0.6$ GeV, so
as to reduce background from higher energy neutrino interactions. In the
vacuum, the probability of the appearance of a $\nu_e$ in a $\nu_\mu$ beam is given (in
our customary effective 2-state mixing approximation) by (21.126) as

$$P(\nu_\mu \to \nu_e; L, E) = \sin^2\theta_{23}\, \sin^2 2\theta_{13}\, \sin^2\left(\frac{\Delta m^2_{31}}{4E} L\right); \tag{21.134}$$


$P(\bar\nu_\mu \to \bar\nu_e; L, E)$ is given by the same expression. Taking $|\Delta m^2_{31}| = 2.4 \times 10^{-3}$ eV$^2$
and $\sin^2 2\theta_{23} = 1$, the number of expected $\nu_e$ events was $1.5 \pm 0.3$ (syst.)
for $\sin^2 2\theta_{13} = 0$, and $5.5 \pm 1.0$ events if $\sin^2 2\theta_{13} = 0.1$. Six events
were observed which passed all the $\nu_e$ selection criteria. As we will see in the
following section, the value of $\sin^2 2\theta_{13} = 0.1$ is entirely consistent with direct
measurements of this quantity reported in 2012.

Another long baseline accelerator experiment is MINOS at Fermilab. Neutrinos
are produced by the Neutrinos at the Main Injector facility (NuMI),
using 120 GeV protons from the Fermilab main injector. The detector is a
5.4 kton iron-scintillator tracking calorimeter with a toroidal magnetic field,
situated underground in the Soudan mine, 735 km from Fermilab. The neutrino
energy spectrum from a wide-band beam is horn-focused to be enhanced
in the 1-5 GeV range. The current MINOS results yield
$|\Delta m^2_{31}| = (2.32^{+0.12}_{-0.08}) \times 10^{-3}$ eV$^2$, and $\sin^2 2\theta_{23} > 0.90$ at 90% CL (Adamson et al.
2011).
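
The relation between baseline and beam energy used in these long-baseline experiments can be made explicit. The sketch below evaluates the energy of the first oscillation maximum, $E = |\Delta m^2_{31}| L/2\pi$, for the three baselines mentioned above, together with the two-state appearance probability (21.134). The function names are ours, $\sin^2\theta_{23} = 1/2$ is taken as the default (corresponding to $\sin^2 2\theta_{23} = 1$), and $\hbar c$ is the standard value.

```python
import math

HBARC_EV_KM = 1.973269804e-10  # hbar*c in eV km
DM2_31 = 2.4e-3                # eV^2, magnitude quoted in (21.125)


def phase(L_km, E_GeV, dm2=DM2_31):
    """Dimensionless oscillation phase dm2 * L / (4E)."""
    return dm2 * L_km / (4.0 * E_GeV * 1e9 * HBARC_EV_KM)


def first_maximum_energy_GeV(L_km, dm2=DM2_31):
    """Energy at which the phase equals pi/2, i.e. E = dm2 * L / (2 pi)."""
    return dm2 * L_km / (2.0 * math.pi * 1e9 * HBARC_EV_KM)


def appearance_probability(L_km, E_GeV, sin2_2t13, sin2_t23=0.5):
    """Two-state vacuum formula (21.134) for nu_mu -> nu_e."""
    return sin2_t23 * sin2_2t13 * math.sin(phase(L_km, E_GeV)) ** 2


for name, L in (("K2K", 250.0), ("T2K", 295.0), ("MINOS", 735.0)):
    print(name, round(first_maximum_energy_GeV(L), 2), "GeV")

E_max = first_maximum_energy_GeV(295.0)
print(appearance_probability(295.0, E_max, 0.1))
```

For the T2K baseline the first maximum lies just below 0.6 GeV, and at that energy the appearance probability for $\sin^2 2\theta_{13} = 0.1$ is 5% in this approximation.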

A second reactor experiment, KamLAND at Kamioka, was designed to
be sensitive to the smaller squared mass difference $\Delta m^2_{21}$, and thus to $\theta_{12}$.
The Kamioka Liquid scintillator AntiNeutrino Detector is at the site of the
former Kamiokande experiment. The detector is essentially one kiloton of highly
purified liquid scintillator surrounded by photomultiplier tubes. $\bar\nu_e$s are detected
as usual via the inverse $\beta$-decay reaction $\bar\nu_e + p \to e^+ + n$. KamLAND
is surrounded by 55 nuclear power units, each an isotropic $\bar\nu_e$ source. The
flux-weighted average path length is $L \sim 180$ km, and the energy $E$ ranges
from about 2 MeV to about 8 MeV. For $E = 3$ MeV, $2\lambda_{21}(E) \sim 30$ km, which
allows for more than one oscillation. In this case (21.119) reduces to

$$P(\bar\nu_e \to \bar\nu_e; L, E) = 1 - 4|U_{e1}|^2 |U_{e2}|^2 \sin^2(L/2\lambda_{21}(E)) \tag{21.135}$$

assuming $|U_{e3}| \approx 0$. In a parametrization of the form (20.166), this becomes

$$P(\bar\nu_e \to \bar\nu_e; L, E) = 1 - \sin^2 2\theta_{12}\, \sin^2(L/2\lambda_{21}(E)), \tag{21.136}$$


again a simple 2-state mixing result. Data shown in figure 21.11 (Abe et al.
2008) give

$$|\Delta m^2_{21}| = 7.58^{+0.14}_{-0.13}{}^{+0.15}_{-0.15} \times 10^{-5}\ \mathrm{eV}^2 \tag{21.137}$$
$$\tan^2\theta_{12} = 0.56^{+0.10}_{-0.07}{}^{+0.10}_{-0.06}. \tag{21.138}$$

The KamLAND data showed for the first time the periodic behaviour of the
$\bar\nu_e$ survival probability.

We now return to the solar neutrino problem, taking up the story after 
Davis’ results. Some doubts remained as to whether the solar calculations 
could be absolutely relied upon, for example because of the extreme sensitivity 
to the core temperature ($\propto T^{18}$). One particular class of $\nu_e$ could, however,
be reliably calculated, namely those associated with the initial reaction
$pp \to {}^2\mathrm{H} + e^+ + \nu_e$ of the pp cycle. Whereas the Davis experiments allowed detection
of the higher energy $\nu_e$s (threshold 814 keV) from the $^8$B and $^7$Be stages of
the cycle, the energy of the $\nu_e$s from the pp stage cuts off at around 400
keV. Detectors using the reaction $\nu_e + {}^{71}\mathrm{Ga} \to e^- + {}^{71}\mathrm{Ge}$, which has a 233
keV energy threshold, were built (GALLEX, GNO and SAGE); their results
(Altman et al. 2005, Abdurashitov et al. 2009) are in agreement, and again
much smaller than the (updated) Bahcall et al. (2005) prediction.

FIGURE 21.11

Ratio of the background and geo-neutrino subtracted $\bar\nu_e$ spectrum to the expectation
for no-oscillation, as a function of $L_0/E$, where $L_0 = 180$ km. Figure
reprinted with permission from S Abe et al. (KamLAND Collaboration) Phys. 
Rev. Lett. 100 221803 (2008). Copyright 2008 by the American Physical So- 
ciety. 


In 1999, the Sudbury Neutrino Observatory (SNO) in Canada began observation.
This experiment used 1 kiloton of ultra-pure heavy water (D$_2$O).
It measured $^8$B solar $\nu_e$s via both the CC reaction $\nu_e + d \to e^- + p + p$, and
the NC reaction $\nu + d \to \nu + p + n$, as well as elastic $\nu e^-$ scattering. The CC
reaction is sensitive only to $\nu_e$, while the NC reaction is sensitive to all active
neutrinos, as is $\nu e^-$ scattering. If the solar neutrino deficit were caused by
neutrino oscillations, the solar neutrino fluxes measured by the CC and NC
reactions would be significantly different. SNO found that, while the total
neutrino flux was consistent with solar model expectations, the ratio of the
$\nu_e$ flux to the total neutrino flux was about 1/3 (Ahmad et al. 2001, 2002).
This number can be understood in terms of the effect of dense matter on the 
propagation of the ves, as we now discuss. 


21.4.4 Matter effects in neutrino oscillations 


We have assumed in the foregoing that neutrinos propagate in vacuum be- 
tween the source and the detector. Since neutrinos interact only weakly, it 
might seem that this is always an excellent approximation. But in the same 
way that light travelling through a transparent medium can have its refractive 
index changed, so can a neutrino. In particular, the refractive index can be
different for $\nu_e$ and $\nu_\mu$. The difference in refractive indices is determined by
the difference in the real parts of the forward $\nu_e e^-$ and $\nu_\mu e^-$ elastic scattering
amplitudes (Wolfenstein 1978). The essential point is that the scattering
can be coherent, with the spins and momenta of the particles remaining unchanged.
This means that the effect is going to be proportional to the density
of electrons in the matter traversed, $N_e$. The scattering amplitude, in turn, is
proportional to $G_F$, so that a figure of merit for the effect is given by the product
$G_F N_e$. This has the dimensions of an energy, and can be interpreted as an
addition to the effective 2-state mixing matrix of (21.130). Detailed analysis,
which we omit, shows that the correct addition is actually $+\sqrt{2}\, G_F N_e$, so that
(21.130) is modified to

$$\begin{pmatrix} -\frac{\Delta m^2}{4E}\cos 2\theta + \sqrt{2}\, G_F N_e & \frac{\Delta m^2}{4E}\sin 2\theta \\ \frac{\Delta m^2}{4E}\sin 2\theta & \frac{\Delta m^2}{4E}\cos 2\theta \end{pmatrix}, \tag{21.139}$$

where now $\Delta m^2 = m_2^2 - m_1^2$, and $\theta = \theta_{12}$. Two-state mixing now gives
(problem 21.8) a new mixing angle $\theta_m$ such that

$$\tan 2\theta_m = \frac{\tan 2\theta}{1 - N_e/N_{res}}, \qquad N_{res} = \frac{\Delta m^2 \cos 2\theta}{2\sqrt{2}\, G_F E}, \tag{21.140}$$

and the mass eigenstates $|1\rangle_m, |2\rangle_m$ correspond to the eigenvalue difference

$$E_{2m} - E_{1m} = \left|\frac{\Delta m^2}{2E}\right| \left[\cos^2 2\theta\, (1 - N_e/N_{res})^2 + \sin^2 2\theta\right]^{1/2}. \tag{21.141}$$


We see that although the new term is certainly very small, being proportional
to $G_F$, nevertheless since $\Delta m^2$ is very small also, a significant effect can
occur. In particular, if it should happen that $N_e = N_{res}$ for some $(\theta, E)$, then
$\theta_m$ will be ‘maximal’ ($\theta_m = \pi/4$), irrespective of the value of the original $\theta$.
This is called ‘resonant mixing’ (Mikheev and Smirnov 1985, 1986). It implies
that the probability for a $\nu_e \to \nu_\mu$ flavour change could be greatly enhanced
over the vacuum value, which is proportional to $\sin^2 2\theta_{12}$. A point to note,
also, is that the corresponding formulae for $\bar\nu_e$s are obtained by replacing $N_e$
by $-N_e$; then, depending on the sign of $\Delta m^2 \cos 2\theta_{12}$, resonant mixing can
occur for one or the other of $\nu_e$ or $\bar\nu_e$ as they pass through matter, but not
both. Similar considerations apply to the propagation of neutrinos through
the earth, but we shall not pursue this here (see Nakamura and Petcov in
Nakamura et al. 2010).
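
To get a feeling for the numbers, the following sketch evaluates the resonance density of (21.140) and the behaviour of the matter mixing angle. It is an illustration only: the function names are ours, the values of $G_F$, $\hbar c$ and Avogadro's number are the standard ones rather than numbers from the text, and `atan2` is used so that $\theta_m$ runs continuously from $\theta$ (for $N_e \ll N_{res}$) through $\pi/4$ (at resonance) towards $\pi/2$ (for $N_e \gg N_{res}$).

```python
import math

GF = 1.1663787e-23            # Fermi constant in eV^-2 (1.1663787e-5 GeV^-2)
HBARC_EV_CM = 1.973269804e-5  # hbar*c in eV cm
N_A = 6.02214076e23           # Avogadro's number


def resonance_density(dm2_eV2, theta, E_MeV):
    """N_res of (21.140), converted to electrons per cm^3."""
    n_natural = dm2_eV2 * math.cos(2 * theta) / (2 * math.sqrt(2) * GF * E_MeV * 1e6)  # eV^3
    return n_natural / HBARC_EV_CM ** 3


def matter_angle(theta, ratio):
    """theta_m from (21.140), with ratio = N_e / N_res; result lies in [0, pi/2]."""
    return 0.5 * math.atan2(math.sin(2 * theta), math.cos(2 * theta) * (1.0 - ratio))


def splitting_factor(theta, ratio):
    """Square-root factor of (21.141), relative to the vacuum value |dm2 / 2E|."""
    return math.sqrt(math.cos(2 * theta) ** 2 * (1.0 - ratio) ** 2 + math.sin(2 * theta) ** 2)


theta12 = math.asin(math.sqrt(1.0 / 3.0))  # sin^2(theta_12) = 1/3, for illustration
n_res = resonance_density(7.6e-5, theta12, 10.0)
print("N_res / N_A for E = 10 MeV:", n_res / N_A, "per cm^3")
for ratio in (0.0, 1.0, 100.0):
    print(ratio, matter_angle(theta12, ratio), splitting_factor(theta12, ratio))
```

For a 10 MeV neutrino the resonance density comes out at a few tens of $N_A$ electrons per cm$^3$; at resonance the splitting factor reaches its minimum value $\sin 2\theta$.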

In the case of solar neutrinos, the effect of the above modifications is quite
simple. For the highest energy neutrinos, $N_e \gg N_{res}$ at the centre of the
Sun, so that $\theta_m \approx \pi/2$ at production in the core, and the $\nu_e$ is in the heavier
mass state $|2\rangle_m$. On the way to the surface of the Sun, $N_e$ will decrease, and
a point will be reached when $N_e = N_{res}$. Here the mass difference (21.141)
reaches its minimum, and two limiting cases may be distinguished depending
on the scale of the variation in the electron density, which has been assumed
constant in (21.139)-(21.141). (i) If the density variation is slow enough that
at least one oscillation length fits into the resonant density region, then it
can be shown that the state stays with state $|2\rangle_m$ (‘adiabatic evolution’) until
it reaches the surface of the Sun, when $\theta_m \to \theta_{12}$. The probability that the
neutrino will survive to the earth is then (using (21.133)) $|\langle\nu_e|2\rangle_m|^2 = \sin^2\theta_{12}$,
which has a value of about 1/3. In the alternative limit, (ii), in which the
oscillation length in matter is relatively large with respect to the scale of
density variation, the state may ‘jump’ to the other mass state $|1\rangle_m$ (‘extreme
non-adiabatic evolution’), and then $|\langle\nu_e|1\rangle_m|^2 = \cos^2\theta_{12}$. These are clearly
extreme cases, and numerical work is required in the general case. However,
the data from SNO and other water Cerenkov detectors are consistent with
the first (adiabatic) alternative, and with the value $\sin^2\theta_{12} \approx 1/3$. Note that
the solar data imply that $(m_2^2 - m_1^2)\cos 2\theta_{12} > 0$.

By contrast, for the lowest energy neutrinos we can take $\theta_m \approx \theta_{12}$, so that the
neutrinos are produced in the state $\cos\theta_{12}|1\rangle + \sin\theta_{12}|2\rangle$, and propagate as in a
vacuum, oscillating with maximum excursion $\sin^2 2\theta_{12}$. The detectors average
over many oscillations, giving a factor of 1/2, so that the survival probability
for the low energy $\nu_e$s is $1 - \frac{1}{2}\sin^2 2\theta_{12} \approx 5/9$. The Gallium experiments are
sensitive to the lower energy neutrinos, and indeed record some 60-70% of the
expected flux.
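
The two limiting survival probabilities are simple enough to evaluate directly. The sketch below takes $\sin^2\theta_{12} = 1/3$, the value suggested by the solar data, and uses a function name of our own.

```python
def solar_survival_limits(sin2_theta12):
    """Return (high-energy, low-energy) nu_e survival probabilities.

    High energy: adiabatic conversion leaves the nu_e in the state |2>_m,
    so the survival probability is sin^2(theta_12).
    Low energy: vacuum oscillations averaged over many periods, with the
    oscillating factor sin^2(phase) replaced by its mean value 1/2.
    """
    sin2_2theta12 = 4.0 * sin2_theta12 * (1.0 - sin2_theta12)
    return sin2_theta12, 1.0 - 0.5 * sin2_2theta12


p_high, p_low = solar_survival_limits(1.0 / 3.0)
print(p_high, p_low)
```

With this input the low-energy value is 5/9, to be compared with the 60-70% of the expected flux recorded by the Gallium experiments.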

In summary, the solar neutrino data are consistent with the interpretation 
in terms of neutrino oscillations, as modified by the Wolfenstein-Mikheev- 
Smirnov (MSW) effect. A global solar + KamLAND analysis yields best fit 
values (Aharmim et al. 2010) 


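$$\theta_{12} = \left(34.06^{+1.16}_{-0.84}\right)^\circ, \qquad \Delta m^2_{21} = \left(7.59^{+0.20}_{-0.21}\right) \times 10^{-5}\ \mathrm{eV}^2. \tag{21.142}$$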


21.4.5 Further developments 


Despite the remarkable experimental progress in the studies of neutrino oscil- 
lations over the last decade, there still remain some basic gaps in our knowl- 
edge. Perhaps the most fundamental is the Dirac/Majorana nature of massive 
neutrinos. The most feasible (but very difficult) test is neutrinoless double
$\beta$-decay ($0\nu\beta\beta$-decay), already touched on in section 20.3. As noted there, the
amplitude is proportional to an average Majorana mass parameter $\langle m\rangle$. Experiments
place a lower bound on the half-life for the decay, which translates
into an upper bound on $\langle m\rangle$. The most stringent lower bounds on the half-lives
have been obtained with decays of $^{76}$Ge (Klapdor-Kleingrothaus et al.
2001), $^{130}$Te (Andreotti et al. 2011) and $^{100}$Mo (Arnold et al. 2006). Lower
bounds on the half-lives range from $10^{24}$ to $10^{25}$ years, with corresponding
upper bounds on $\langle m\rangle$ of the order of 0.5 eV. It should, however, be noted that
some participants of the Heidelberg-Moscow experiment claimed the observation
of $0\nu\beta\beta$ decay of $^{76}$Ge with a half-life of $2.23^{+0.44}_{-0.31} \times 10^{25}$ years, from which
they deduced $\langle m\rangle = 0.32 \pm 0.03$ eV (Klapdor-Kleingrothaus et al. 2006). The
GERDA experiment (Ur et al. 2011) should be able to check this claim after
one year of running. Other experiments currently running, or planned, will
push the bound on half-lives up to $10^{25}$-$10^{27}$ years, and the upper bound on
$\langle m\rangle$ down to magnitudes of the order of a few times $10^{-2}$ eV.

A second crucial question concerns the magnitude of CP-violation effects 
in neutrino oscillations. We recall from (20.172) that this vanishes if $\sin\theta_{13} = 0$.
As we saw earlier, CHOOZ set a 90% CL limit $\sin^2 2\theta_{13} < 0.17$. A non-zero
value of $\sin^2 2\theta_{13}$ has now been observed by two groups, both $\bar\nu_e$ disappearance
experiments: the Daya Bay collaboration (An et al. 2012) and the RENO
collaboration (Ahn et al. 2012). Their reported results were

$$\sin^2 2\theta_{13} = 0.092 \pm 0.016 \pm 0.005 \quad \text{(Daya Bay)} \tag{21.143}$$
$$\sin^2 2\theta_{13} = 0.113 \pm 0.013 \pm 0.019 \quad \text{(RENO)}, \tag{21.144}$$

in a 3-neutrino framework. For this value of $\sin\theta_{13}$, it should be possible to
detect a CP-violating difference in the probabilities for $\nu_\mu \to \nu_e$ and $\bar\nu_\mu \to \bar\nu_e$,
and it may be enough to sustain leptogenesis models.

The value of $\sin\theta_{13}$ is also relevant to the determination of the sign of
$\Delta m^2_{31}$; we shall mention just one possibility. We have seen that the MSW effect
for solar neutrinos implies that $m_2 > m_1$ (using the fact that $\cos 2\theta_{12} > 0$), but
the mass spectrum (for 3-neutrino mixing) could be ordered as $m_1 < m_2 < m_3$
(‘normal spectrum’) or as $m_3 < m_1 < m_2$ (‘inverted spectrum’). We have
ignored the terrestrial MSW effect, but it can be significant in long-baseline
accelerator-based experiments, and could be exploited to determine the sign
of $m_3 - m_1$. In the vacuum, the probability of the appearance of a $\nu_e$ in
a $\nu_\mu$ beam is given by (21.134) (in our customary effective 2-state mixing
approximation). As in the solar case, these probabilities will be modified by
the MSW effect, which will enhance (suppress) the appearance probability for
neutrinos (antineutrinos) in the case of the normal spectrum, and vice versa
for the inverted spectrum. Clearly if $\theta_{13}$ were too small, the effect would be
very hard to see, but the value in (21.143) and (21.144) makes this a realistic
experiment; it formed part of the physics motivation for the NOvA experiment
at Fermilab (Ayres et al. 2005). NOvA is a long-baseline neutrino oscillation
experiment now under construction, which aims to detect the appearance of
$\nu_e$ and $\bar\nu_e$ in the NuMI muon neutrino beam. The beam from Fermilab is
directed 14 mrad off-axis to a detector 810 km away; the neutrino energy is
narrowly peaked around 2.2 GeV. NOvA will also have sensitivity to leptonic
CP-violation.


Problems

21.1 Verify equation (21.34).

21.2 Verify equations (21.46) and (21.47).

21.3 Verify equations (21.48) and (21.49).

21.4 Verify equation (21.56).

21.5 Verify equations (21.89) and (21.90).

21.6 Verify equation (21.119).

21.7 Verify equation (21.126).

21.8 Verify equations (21.140) and (21.141).


The Glashow-Salam-Weinberg Gauge
Theory of Electroweak Interactions


22.1 Difficulties with the current-current and ‘naive’ IVB
models


In chapter 20 we developed the ‘V-A current-current’ phenomenology of weak
interactions. We saw that this gives a remarkably accurate account of a wide 
range of data — so much so, in fact, that one might well wonder why it should 
not be regarded as a fully-fledged theory. One good reason for wanting to do 
this would be in order to carry out calculations beyond the lowest order, which 
is essentially all we have used it for so far (with the significant exceptions of 
the GIM argument, and box diagrams in $M$-$\bar{M}$ mixing). Such higher-order cal-
culations are indeed required by the precision attained in modern high energy
experiments. But the electroweak theory of Glashow, Salam and Weinberg, 
now recognized as one of the pillars of the Standard Model, was formulated 
long before such precision measurements existed, under the impetus of quite 
compelling theoretical arguments. These had to do, mainly, with certain in- 
principle difficulties associated with the current-current model, if viewed as a 
‘theory’. Since we now believe that the GSW theory is the correct description 
of electroweak interactions up to currently tested energies, further discussions 
of these old issues concerning the current-current model might seem irrele- 
vant. However, these difficulties do raise several important points of principle. 
An understanding of them provides valuable motivation for the GSW theory 
— and some idea of what is ‘at stake’ in regard to experiments relating to 
the Higgs sector, which has only recently begun to be explored (see section 
22.8.3). 

Before reviewing the difficulties, however, it is worth emphasizing once 
again a more positive motivation for a gauge theory of weak interactions 
(Glashow 1961). This is the remarkable ‘universality’ structure noted in chap- 
ter 20, not only as between different types of lepton, but also (within the con- 
text of CKM mixing) between the quarks and the leptons. This recalls very 
strongly the ‘universality’ property of QED, and the generalization of this 
property in the non-Abelian theories of chapter 13. A gauge theory would 
provide a natural framework for such universal couplings.

FIGURE 22.1
Current-current amplitude for $\bar{\nu}_\mu + \mu^- \to \bar{\nu}_e + e^-$.


22.1.1 Violations of unitarity 


We have seen several examples, in chapter 20, in which cross sections were 
predicted to rise indefinitely as a function of the invariant variable s, which 
is the square of the total energy in the CM frame. We begin by showing why 
this is ultimately an unacceptable behaviour. 

Consider the process (figure 22.1) 


$$ \bar{\nu}_\mu + \mu^- \to \bar{\nu}_e + e^- \qquad (22.1) $$


in the current-current model, regarding it as a fundamental interaction, treated
to lowest order in perturbation theory. A similar process was discussed in 
chapter 20. Since the troubles we shall find occur at high energies, we can 
simplify the expressions by neglecting the lepton masses without altering the 
conclusions. In this limit the invariant amplitude is (problem 22.1), up to a 
numerical factor, 

$$ \mathcal{M} = G_F E^2 (1 + \cos\theta) \qquad (22.2) $$


where $E$ is the CM energy, and $\theta$ is the CM scattering angle of the $e^-$ with re-
spect to the direction of the incident $\mu^-$. This leads to the following behaviour
of the cross section (cf (20.83), remembering that $s = 4E^2$):


$$ \sigma \sim G_F^2 E^2. \qquad (22.3) $$


The dependence on $E^2$ is a consequence of the fact that $G_F$ is not di-
mensionless, having the dimensions of $[M]^{-2}$. Its value is (Nakamura et al.
2010)

$$ G_F = 1.16637(1) \times 10^{-5}\ \mathrm{GeV}^{-2}. \qquad (22.4) $$


The cross section has dimensions of $[L]^2 = [M]^{-2}$, but must involve $G_F^2$ which
has dimension $[M]^{-4}$. It must also be relativistically invariant. At energies
well above lepton masses, the only invariant quantity available to restore the
correct dimensions to $\sigma$ is $s$, the square of the CM energy $E$, so that $\sigma \sim G_F^2 E^2$.

Consider now a partial wave analysis of this process. For spinless particles 
the total cross section may be written as a sum of partial wave cross sections 


$$ \sigma = \frac{4\pi}{k^2} \sum_J (2J + 1) |f_J|^2 \qquad (22.5) $$

where $f_J$ is the partial wave amplitude for angular momentum $J$ and $k$ is the
CM momentum. It is a consequence of unitarity, or flux conservation (see, for
example, Merzbacher 1998, chapter 13), that the partial wave amplitude may
be written in terms of a phase shift $\delta_J$:


$$ f_J = e^{i\delta_J} \sin\delta_J \qquad (22.6) $$


so that

$$ |f_J| \le 1. \qquad (22.7) $$


Thus the cross section in each partial wave is bounded by

$$ \sigma_J \le 4\pi(2J + 1)/k^2 \qquad (22.8) $$


which falls as the CM energy rises. By contrast, in (22.3) we have a cross
section that rises with CM energy:


$$ \sigma \sim E^2. \qquad (22.9) $$


Moreover, since the amplitude (equation (22.2)) only involves $(\cos\theta)^0$ and
$(\cos\theta)^1$ contributions, it is clear that this rise in $\sigma$ is associated with only a
few partial waves, and is not due to more and more partial waves contributing
to the sum in $\sigma$. Therefore, at some energy $E$, the unitarity bound will be
violated by this lowest-order (Born approximation) expression for $\sigma$.

This is the essence of the ‘unitarity disease’ of the current-current model. 
To fill in all the details, however, involves a careful treatment of the appropri- 
ate partial wave analysis for the case when all particles carry spin. We shall 
avoid those details. Instead we argue, again on dimensional grounds, that the 
dimensionless partial wave amplitude $f_J$ (note the $1/k^2$ factor in (22.5)) must
be proportional to $G_F E^2$, which violates the bound (22.7) for CM energies


$$ E > G_F^{-1/2} \sim 300\ \mathrm{GeV}. \qquad (22.10) $$


At this point the reader may recall a very similar-sounding argument made 
in section 11.8, which led to the same estimate of the ‘dangerous’ energy scale 
(22.10). In that case, the discussion referred to a hypothetical ‘4-fermi’ inter- 
action without the V-A structure, and it was concerned with renormalization 
rather than unitarity. The gamma-matrix structure is irrelevant to these is- 
sues, which ultimately have to do with the dimensionality of the coupling 
constant, in both cases. In fact, as we shall see, unitarity and renormalizabil- 
ity are actually rather closely related. 
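
The ‘dangerous’ energy scale (22.10) follows directly from the value of the Fermi constant in (22.4). The short sketch below evaluates it, together with the dimensionless combination $G_F E^2$ at a few energies. It is a dimensional estimate only: the numerical factors omitted in (22.2) are not restored, and the function name `born_strength` is hypothetical.

```python
G_F = 1.16637e-5  # Fermi constant in GeV^-2, value quoted in (22.4)


def born_strength(energy_gev):
    """Dimensionless combination G_F * E^2 that controls the lowest-order amplitude."""
    return G_F * energy_gev ** 2


# Dimensional estimate (22.10): the bound |f_J| <= 1 is in danger once G_F * E^2 ~ 1.
E_critical = G_F ** -0.5  # GeV

for energy in (1.0, 10.0, 100.0, E_critical):
    print(f"E = {energy:7.1f} GeV   G_F E^2 = {born_strength(energy):.3e}")
```

At the energies of ordinary weak decays the combination $G_F E^2$ is tiny, which is why the lowest-order current-current description works so well there; it only approaches unity at a few hundred GeV.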

Faced with this unitarity difficulty, we appeal to the most successful theory 
we have, and ask: what happens in QED? We consider an apparently quite 
similar process, namely $e^+e^- \to \mu^+\mu^-$ in lowest order (figure 22.2). In chapter
8 the total cross section for this process, neglecting lepton masses, was found 
to be (see problem 8.18 and equation (9.87)) 


$$ \sigma = 4\pi\alpha^2/3E^2 \qquad (22.11) $$

FIGURE 22.2
One-photon annihilation graph for $e^+e^- \to \mu^+\mu^-$.


which obediently falls with energy as required by unitarity. In this case the
coupling constant $\alpha$, analogous to $G_F$, is dimensionless, so that a factor $E^2$ is
required in the denominator to give $\sigma \sim [L]^2$.

If we accept this clue from QED, we are led to search for a theory of 
weak interactions that involves a dimensionless coupling constant. Press- 
ing the analogy with QED further will help us to see how one might arise. 
Fermi's current-current model was, as we said, motivated by the vector cur- 
rents of QED. But in Fermi’s case the currents interact directly with each 
other, whereas in QED they interact only indirectly via the mediation of the 
electromagnetic field. More formally, the Fermi current-current interaction 
has the ‘four point’ structure 


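$$ \text{‘}G_F\,(\bar{\hat{\psi}}\hat{\psi})\cdot(\bar{\hat{\psi}}\hat{\psi})\text{’} \qquad (22.12) $$

while QED has the ‘three-point’ (Yukawa) structure

$$ \text{‘}e\,\bar{\hat{\psi}}\hat{\psi}\hat{A}\text{’}. \qquad (22.13) $$


Dimensional analysis easily shows, once again, that $[G_F] = M^{-2}$ while $[e] = M^0$.
This strongly suggests that we should take Fermi’s analogy further, and
look for a weak interaction analogue of (22.13), having the form


$$ \text{‘}g\,\bar{\hat{\psi}}\hat{\psi}\hat{W}\text{’} \qquad (22.14) $$


where $\hat{W}$ is a bosonic field. Dimensional analysis shows, of course, that
$[g] = M^0$.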
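
The dimension counting invoked here can be spelled out in a few lines. The following is a sketch in natural units ($\hbar = c = 1$), assuming the standard mass dimensions of fermion and vector fields in four space-time dimensions, which are not derived in this section:

```latex
\begin{align*}
[\hat{\mathcal{L}}] &= M^4, \qquad [\hat{\psi}] = M^{3/2}, \qquad [\hat{A}^\mu] = [\hat{W}^\mu] = M, \\
[G_F]\,[\hat{\psi}]^4 &= M^4 \;\Rightarrow\; [G_F] = M^{4-6} = M^{-2}, \\
[e]\,[\hat{\psi}]^2\,[\hat{A}] &= M^4 \;\Rightarrow\; [e] = M^{4-3-1} = M^0, \\
[g]\,[\hat{\psi}]^2\,[\hat{W}] &= M^4 \;\Rightarrow\; [g] = M^0 .
\end{align*}
```

The four-fermion coupling is forced to carry two inverse powers of mass because four fermion fields already exceed the dimension of the Lagrangian density, whereas a three-point coupling to a vector field saturates it exactly.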

Since the weak currents are in fact vector-like, we must assume that the W 
fields are also vectors (spin-1) so as to make (22.14) Lorentz invariant. And 
because the weak interactions are plainly not long-range, like electromagnetic 
ones, the mass of the W quanta cannot be zero. So we are led to postulate 
the existence of a massive weak analogue of the photon, the ‘intermediate
vector boson’ (IVB), and to suppose that weak interactions are mediated by
the exchange of IVBs.

There is, of course, one further difference with electromagnetism, which 
is that the currents in $\beta$-decay, for example, carry charge (e.g.
$\bar{\hat{e}}\gamma^\mu(1 - \gamma_5)\hat{\nu}_e$ creates negative charge or destroys positive charge). The ‘companion’
hadronic current carries the opposite charge (e.g. $\bar{\hat{\psi}}_{\rm p}\gamma_\mu(1 - r\gamma_5)\hat{\psi}_{\rm n}$ destroys
negative charge or creates positive charge), so as to make the total effective
interaction charge-conserving, as required. It follows that the W fields must
then be charged, so that expressions of the form (22.14) are neutral. Because
both charge-raising and charge-lowering currents exist, we need both $W^+$ and
$W^-$. The reaction (22.1), for example, is then conceived as proceeding via
the Feynman diagram shown in figure 22.3, quite analogous to figure 22.2.

FIGURE 22.3
One-$W^-$ annihilation graph for $\bar{\nu}_\mu + \mu^- \to \bar{\nu}_e + e^-$.

Because we also have weak neutral currents, we need a neutral vector 
boson as well, $Z^0$. In addition to all these, there is the familiar massless
neutral vector boson, the photon. Despite the fact that they are not massless,
the $W^\pm$ and $Z^0$ can be understood as gauge quanta, thanks to the symmetry-
breaking mechanism explained in section 19.6. For the moment, however, we
are going to follow a more scenic route, and accept (as Glashow did in 1961) 
that we are dealing with ordinary ‘unsophisticated’ massive vector particles, 
charged and uncharged. 

We now investigate whether the IVB model can do any better with unitar- 
ity than the current-current model. The analysis will bear a close similarity 
to the discussion of the renormalizability of the model in section 19.1, and we 
shall take up that issue again in section 22.1.2. 

The unitarity-violating processes turn out to be those involving external 
W particles. Consider, for example, the process 


$$ \nu_\mu + \bar{\nu}_\mu \to W^+ + W^- \qquad (22.15) $$


proceeding via the graph shown in figure 22.4. The fact that this is experimen- 
tally a somewhat esoteric reaction is irrelevant for the subsequent argument: 
the proposed theory, represented by the IVB modification of the four-fermion 
model, will necessarily generate the amplitude shown in figure 22.4, and since 
this amplitude violates unitarity, the theory is unacceptable. The amplitude 
for this process is proportional to 


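$$ \mathcal{M}_{\lambda_1\lambda_2} = g^2\, \epsilon_\mu^{-*}(k_2, \lambda_2)\, \epsilon_\nu^{+*}(k_1, \lambda_1)\, \bar{v}(p_2)\gamma^\mu(1 - \gamma_5)\, \frac{(\slashed{p}_1 - \slashed{k}_1 + m_\mu)}{(p_1 - k_1)^2 - m_\mu^2}\, \gamma^\nu(1 - \gamma_5)u(p_1) \qquad (22.16) $$


where the $\epsilon^*$ are the polarization vectors of the W’s: $\epsilon_\mu^{-*}(k_2, \lambda_2)$ is that
associated with the outgoing $W^-$ with 4-momentum $k_2$ and polarization state
$\lambda_2$, and similarly for $\epsilon_\nu^{+*}$.

FIGURE 22.4
$\mu^-$-exchange graph for $\nu_\mu + \bar{\nu}_\mu \to W^+ + W^-$.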

To calculate the total cross section, we must form $|\mathcal{M}|^2$ and sum over the
three states of polarization for each of the W’s. To do this, we need the result


$$ \sum_{\lambda=0,\pm1} \epsilon_\mu(k, \lambda)\,\epsilon_\nu^*(k, \lambda) = -g_{\mu\nu} + k_\mu k_\nu/M_W^2 \qquad (22.17) $$


already given in (19.19). Our interest will as usual be in the high-energy
behaviour of the cross section, in which regime it is clear that the $k_\mu k_\nu/M_W^2$
term in (22.17) will dominate the $g_{\mu\nu}$ term. It is therefore worth looking
a little more closely at this term. From (19.17) and (19.18) we see that
in a frame in which $k^\mu = (k^0, 0, 0, |\boldsymbol{k}|)$, the transverse polarization vectors
$\epsilon^\mu(k, \lambda = \pm 1)$ involve no momentum dependence, which is in fact carried
solely in the longitudinal polarization vector $\epsilon^\mu(k, \lambda = 0)$. We may write this
as


$$ \epsilon^\mu(k, \lambda = 0) = \frac{k^\mu}{M_W} + \frac{M_W}{(k^0 + |\boldsymbol{k}|)}\,(-1, \hat{\boldsymbol{k}}) \qquad (22.18) $$

which at high energy tends to $k^\mu/M_W$. Thus it is clear that it is the lon-
gitudinal polarization states which are responsible for the $k^\mu k^\nu$ parts of the
polarization sum (22.17), and which will dominate real production of W’s at
high energy.
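
The statements about the polarization vectors can be checked numerically. The sketch below assumes the standard explicit forms for a massive vector particle moving along the z axis, namely transverse vectors $\mp(0, 1, \pm i, 0)/\sqrt{2}$ and longitudinal vector $(|\boldsymbol{k}|, 0, 0, k^0)/M_W$; the numerical values of `M_W` and `k_abs` are illustrative assumptions, not taken from this section. It verifies the decomposition (22.18) and the polarization sum (22.17), written with upper indices.

```python
import math

M_W = 80.4      # GeV; illustrative W mass, an assumption
k_abs = 1000.0  # GeV; hypothetical W momentum along the z axis
k0 = math.hypot(k_abs, M_W)
k = (k0, 0.0, 0.0, k_abs)
g = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]  # metric tensor

s = 1 / math.sqrt(2)
eps = {
    +1: (0, -s, -1j * s, 0),           # transverse, helicity +1
    -1: (0, s, -1j * s, 0),            # transverse, helicity -1
    0: (k_abs / M_W, 0, 0, k0 / M_W),  # longitudinal
}

# Decomposition (22.18): k^mu / M_W plus a remainder that falls like M_W / (2|k|).
remainder = [M_W / (k0 + k_abs) * n for n in (-1.0, 0.0, 0.0, 1.0)]
eps0_from_22_18 = [k[mu] / M_W + remainder[mu] for mu in range(4)]

# Polarization sum with upper indices, compared with -g^{mu nu} + k^mu k^nu / M_W^2.
pol_sum = [[sum(e[mu] * complex(e[nu]).conjugate() for e in eps.values())
            for nu in range(4)] for mu in range(4)]
expected = [[-g[mu][nu] + k[mu] * k[nu] / M_W ** 2 for nu in range(4)]
            for mu in range(4)]
max_dev = max(abs(pol_sum[mu][nu] - expected[mu][nu])
              for mu in range(4) for nu in range(4))

print("largest remainder component in (22.18):", max(abs(r) for r in remainder))
print("largest deviation from (22.17):", max_dev)
```

For $|\boldsymbol{k}| \gg M_W$ the remainder in (22.18) is of order $M_W/2|\boldsymbol{k}|$, while the $k^\mu/M_W$ piece grows linearly with momentum, which is the origin of the bad high-energy behaviour discussed in the text.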
Concentrating therefore on the production of longitudinal W’s, we are led 
to examine the quantity 


$$ \frac{g^4}{M_W^4 (p_1 - k_1)^4}\, \mathrm{Tr}\left[\slashed{k}_2 (1 - \gamma_5)(\slashed{p}_1 - \slashed{k}_1)\slashed{k}_1 \slashed{p}_1 \slashed{k}_1 (\slashed{p}_1 - \slashed{k}_1)\slashed{k}_2 \slashed{p}_2\right] \qquad (22.19) $$


where we have neglected $m_\mu$, commuted the $(1 - \gamma_5)$ factors through, and ne-
glected neutrino masses, in forming $\sum_{\rm spins} |\mathcal{M}_{00}|^2$. Retaining only the leading
powers of energy, we find (see problem 22.2)


$$ \sum_{\rm spins} |\mathcal{M}_{00}|^2 \sim (g^4/M_W^4)(p_1 \cdot k_2)(p_2 \cdot k_2) = (g^4/M_W^4) E^4 (1 - \cos^2\theta) \qquad (22.20) $$


where $E$ is the CM energy and $\theta$ the CM scattering angle. We see that the
(unsquared) amplitude must behave essentially as $g^2 E^2/M_W^2$, the quantity
$g^2/M_W^2$ effectively replacing $G_F$ of the current-current model. The unitarity
bound is violated for $E > M_W/g \sim 300$ GeV, taking $g \approx e$.

Other unitarity-violating processes can easily be invented, and we have to 
conclude that the IVB model is, in this respect, no more fitted to be called a 
theory than was the four-fermion model. In the case of the latter, we argued 
that the root of the disease lay in the fact that $G_F$ was not dimensionless, yet
somehow making the coupling dimensionless was not a good enough cure after all:
perhaps (it is indeed so)
‘dimensionlessness’ is necessary but not sufficient (see the following section).
Why is this? Returning to $\mathcal{M}_{\lambda_1\lambda_2}$ for $\nu\bar{\nu} \to W^+W^-$ (equation (22.16)) and
setting $\epsilon_\mu = k_\mu/M_W$ for the longitudinal polarization vectors, we see that we are
involved with an effective amplitude


$$ \frac{g^2}{M_W^2}\, \bar{v}(p_2)\slashed{k}_2 (1 - \gamma_5) \frac{(\slashed{p}_1 - \slashed{k}_1)}{(p_1 - k_1)^2} \slashed{k}_1 (1 - \gamma_5) u(p_1) \qquad (22.21) $$


$$ = -\frac{2g^2}{M_W^2}\, \bar{v}(p_2)\slashed{k}_2 (1 - \gamma_5) u(p_1). \qquad (22.22) $$


We see that the longitudinal $\epsilon$’s have brought in the factors $M_W^{-2}$, which are
‘compensated’ by the factor $\slashed{k}_2$, and it is this latter factor which causes the rise
with energy. The longitudinal polarization states have effectively reintroduced
a dimensional coupling constant $g/M_W$.
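
The reduction of (22.21) to the form (22.22) can be filled in as follows. This is a sketch assuming massless neutrinos, so that $\slashed{p}_1 u(p_1) = 0$, and a negligible muon mass; the overall numerical factor depends on how the $(1 - \gamma_5)$ factors are normalized and plays no role in the energy dependence.

```latex
\begin{align*}
\slashed{k}_1 (1 - \gamma_5) &= (1 + \gamma_5)\slashed{k}_1, \qquad
(\slashed{p}_1 - \slashed{k}_1)(1 + \gamma_5) = (1 - \gamma_5)(\slashed{p}_1 - \slashed{k}_1), \\
(1 - \gamma_5)^2 &= 2(1 - \gamma_5), \\
\slashed{k}_1 u(p_1) &= -(\slashed{p}_1 - \slashed{k}_1) u(p_1)
  \quad \text{since } \slashed{p}_1 u(p_1) = 0, \\
(\slashed{p}_1 - \slashed{k}_1)\slashed{k}_1 u(p_1) &= -(p_1 - k_1)^2 u(p_1), \\
\frac{g^2}{M_W^2}\, \bar{v}(p_2)\slashed{k}_2 (1 - \gamma_5)
  \frac{(\slashed{p}_1 - \slashed{k}_1)}{(p_1 - k_1)^2}
  \slashed{k}_1 (1 - \gamma_5) u(p_1)
  &= -\frac{2g^2}{M_W^2}\, \bar{v}(p_2)\slashed{k}_2 (1 - \gamma_5) u(p_1).
\end{align*}
```

The propagator denominator is cancelled exactly, so nothing remains to soften the single power of momentum carried by $\slashed{k}_2$.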

What happens in QED? We learnt in section 7.3 that, for real photons, 
the longitudinal state of polarization is absent altogether. We might well 
suspect, therefore, that since it was the longitudinal W’s that caused the ‘bad’ 
high-energy behaviour of the IVB model, the ‘good’ high-energy behaviour of 
QED might have its origin in the absence of such states for photons. And 
this circumstance can, in its turn, be traced (cf section 7.3.1) to the gauge
invariance property of QED. 

Indeed, in section 8.6.3 we saw that in the analogue of (22.17) for photons 
(this time involving only the two transverse polarization states), the right-
hand side could be taken to be just $-g_{\mu\nu}$, provided that the Ward identity
(8.166) held, a condition directly following from gauge invariance. 

We have arrived here at an important theoretical indication that what we 
really need is a gauge theory of the weak interactions, in which the W’s are 
gauge quanta. It must, however, be a peculiar kind of gauge theory, since 
normally gauge invariance requires the gauge field quanta to be massless. 
However, we have already seen how this ‘peculiarity’ can indeed arise, if the 
local symmetry is spontaneously broken (chapter 19). But before proceeding 
to implement that idea, in the GSW theory, we discuss one further disease 
(related to the unitarity one) possessed by both current-current and IVB 
models: that of non-renormalizability.

FIGURE 22.5
$O(g^4)$ contribution to $\nu_\mu\bar{\nu}_\mu \to \nu_\mu\bar{\nu}_\mu$.


22.1.2 The problem of non-renormalizability in weak 
interactions 


The preceding line of argument about unitarity violations is open to the follow- 
ing objection. It is an argument conducted entirely within the framework of 
perturbation theory. What it shows, in fact, is simply that perturbation theory 
must fail, in theories of the type considered, at some sufficiently high energy. 
The essential reason is that the effective expansion parameter for perturbation
theory is $E G_F^{1/2}$. Since $E G_F^{1/2}$ becomes large at high energy, arguments based
on lowest-order perturbation theory are irrelevant. The objection is perfectly
valid, and we shall take account of it by linking high-energy behaviour to the 
problem of renormalizability, rather than unitarity. We might, however, just 
note in passing that yet another way of stating the results of the previous two 
sections is to say that, for both the current-current and IVB theories, ‘weak 
interactions become strong at energies of order 1 TeV’. 

We gave an elementary introduction to renormalization in chapters 10 and 
11 of volume 1. In particular, we discussed in some detail, in section 11.8, 
the difficulties that arise when one tries to do higher-order calculations in 
the case of a four-fermion interaction with the same form (apart from the 
V-A structure) as the current-current model. Its coupling constant, which we
called $G_F$, also had dimension (mass)$^{-2}$. The ‘non-renormalizable’ problem
was essentially that, as one approached the ‘dangerous’ energy scale (22.10), 
one needed to supply the values of an ever-increasing number of parameters 
from experiment, and the theory lost predictive power. 

Does the IVB model fare any better? In this case, the coupling constant 
is dimensionless, just as in QED. ‘Dimensionlessness’ alone is not enough, it 
turns out: the IVB model is not renormalizable either. We gave an indication 
of why this is so in section 19.1, but we shall now be somewhat more specific, 
relating the discussion to the previous one about unitarity. 

Consider, for example, the fourth-order process shown in figure 22.5, for
the IVB-mediated process $\nu_\mu\bar{\nu}_\mu \to \nu_\mu\bar{\nu}_\mu$. It seems plausible from the diagram
that the amplitude must be formed by somehow ‘sticking together’ two copies
of the tree graph shown in figure 22.4.$^1$ Now we saw that the high-energy
behaviour of the amplitude $\nu\bar{\nu} \to W^+W^-$ (figure 22.4) grows as $E^2$, due to
the $k$ dependence of the longitudinal polarization vectors, and this turns out
to produce, via figure 22.5, a non-renormalizable divergence, for the reason
indicated in section 19.1: namely, the ‘bad’ behaviour of the $k^\mu k^\nu/M_W^2$ factors
in the W-propagators, at large $k$.


FIGURE 22.6
$O(e^4)$ contributions to $e^+e^- \to e^+e^-$.


FIGURE 22.7
Lowest-order amplitudes for $e^+e^- \to \gamma\gamma$: (a) direct graph, (b) crossed graph.

So it is plain that, once again, the blame lies with the longitudinal polarization
states for the W’s. Let us see how QED, a renormalizable theory,
manages to avoid this problem. In this case, there are two box graphs, shown
in figure 22.6. There are also two corresponding tree graphs, shown in figures
22.7(a) and (b). Consider, therefore, mimicking for figures 22.7(a) and (b) the
calculation we did for figure 22.4. We would obtain the leading high-energy
behaviour by replacing the photon polarization vectors by the corresponding
momenta, and it can be checked (problem 22.3) that when this replacement
is made for each photon the complete amplitude for the sum of figures 22.7(a)
and (b) vanishes.


$^1$The reader may here usefully recall the discussion of unitarity for one-loop graphs in
section 13.3.3.


FIGURE 22.8
Four-point $e^+e^-$ vertex.


FIGURE 22.9
Four-point $\nu\bar{\nu}$ vertex.

In physical terms, of course, this result was expected, since we knew in 
advance that it is always possible to choose polarization vectors for real pho- 
tons such that they are purely transverse, so that no physical process can 
depend on a part of $\epsilon_\mu$ proportional to $k_\mu$. Nevertheless, the calculation is
highly relevant to the question of renormalizing the graphs in figure 22.6. The 
photons in this process are not real external particles, but are instead virtual, 
internal ones. This has the consequence that we should in general include 
their longitudinal ($\epsilon_\mu \propto k_\mu$) states as well as the transverse ones (see section
13.3.3 for something similar in the case of unitarity for 1-loop diagrams). The 
calculation of problem 22.3 then suggests that these longitudinal states are 
harmless, provided that both contributions in figure 22.7 are included. 

Indeed, the sum of these two box graphs for $e^+e^- \to e^+e^-$ is not divergent.
If it were, an infinite counter term proportional to a four-point vertex
$e^+e^- \to e^+e^-$ (figure 22.8) would have to be introduced, and the original
QED theory, which of course lacks such a fundamental interaction, would not
be renormalizable. This is exactly what does happen in the case of figure
22.5. The bad high-energy behaviour of $\nu\bar{\nu} \to W^+W^-$ translates into a divergence
of figure 22.5, and this time there is no ‘crossed’ amplitude to cancel
it. This divergence entails the introduction of a new vertex, figure 22.9, not
present in the original IVB theory. Thus the theory without this vertex is
non-renormalizable; and if we include it, we are landed with a four-field pointlike
vertex which is non-renormalizable, as in the Fermi (current-current) case.

Our presentation hitherto has emphasized the fact that, in QED, the bad
high-energy behaviour is rendered harmless by a cancellation between contri- 
butions from figures 22.7(a) and (b) (or figures 22.6(a) and (b)). Thus one 
way to ‘fix up’ the IVB theory might be to hypothesize a new physical process, 
to be added to figure 22.4, in such a way that a cancellation occurred at high 
energies. The search for such high-energy cancellation mechanisms can indeed 
be pushed to a successful conclusion (Llewellyn Smith 1973), given sufficient 
ingenuity and, arguably, a little hindsight. However, we are in possession of 
a more powerful principle. In QED, we have already seen (section 8.6.2) that 
the vanishing of amplitudes when an $\epsilon_\mu$ is replaced by the corresponding $k_\mu$ is
due to gauge invariance: in other words, the potentially harmful longitudinal 
polarization states are in fact harmless in a gauge-invariant theory. 

We have therefore arrived once more, after a somewhat more leisurely 
discussion than that of section 19.1, at the idea that we need a gauge theory 
of massive vector bosons, so that the offending $k^\mu k^\nu$ part of the propagator can
be ‘gauged away’ as in the photon case. This is precisely what is provided by 
the ‘spontaneously broken’ gauge theory concept, as developed in chapter 19. 
There we saw that, taking the U(1) case for simplicity, the general expression 
for the gauge boson propagator in such a theory (in a ’t Hooft gauge) is 


$$ i\left[-g^{\mu\nu} + \frac{(1 - \xi) k^\mu k^\nu}{k^2 - \xi M_W^2}\right] \Big/ \left(k^2 - M_W^2 + i\epsilon\right) \qquad (22.23) $$


where $\xi$ is a gauge parameter. Our IVB propagator corresponds to the
$\xi \to \infty$ limit, and with this choice of $\xi$ all the troubles we have been discussing
appear to be present. But for any finite $\xi$ (for example $\xi = 1$) the high-
energy behaviour of the propagator is actually $\sim 1/k^2$, the same as in the
renormalizable QED case. This strongly suggests that such theories, in
particular non-Abelian ones, are in fact renormalizable. ’t Hooft’s proof
that they are (’t Hooft 1971b) triggered an explosion of theoretical work, as it 
became clear that, for the first time, it would be possible to make higher-order 
calculations for weak interaction processes using consistent renormalization 
procedures, of the kind that had worked so well for QED. 
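
The role of the gauge parameter in (22.23) can be illustrated numerically. The sketch below evaluates $k^2$ times the coefficient of $k^\mu k^\nu$ in the square bracket, as a rough dimensionless measure of that term relative to $g^{\mu\nu}$. The function name `kk_term_size`, the value of `M_W` and the use of $\xi = 10^{12}$ to mimic the $\xi \to \infty$ limit are all illustrative assumptions.

```python
def kk_term_size(k2, xi, mass):
    """k^2 times the coefficient of k^mu k^nu inside the bracket of (22.23)."""
    return k2 * (1.0 - xi) / (k2 - xi * mass ** 2)


M_W = 80.4  # GeV; illustrative W mass, an assumption
gauges = {"xi = 1": 1.0, "xi = 3": 3.0, "xi -> infinity (mimicked by 1e12)": 1e12}

for k2 in (1e6, 1e8, 1e10):  # GeV^2, far above M_W^2 and away from the xi = 3 pole
    row = ", ".join(f"{label}: {kk_term_size(k2, xi, M_W):.3g}"
                    for label, xi in gauges.items())
    print(f"k^2 = {k2:.0e} GeV^2 -> {row}")
```

For finite $\xi$ the measure saturates at $1 - \xi$, so the whole propagator falls like $1/k^2$; in the unitary-gauge limit it grows like $k^2/M_W^2$, and the propagator tends to a constant at large $k$.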

We now have all the pieces in place, and can proceed to introduce the 
GSW theory, based on the local gauge symmetry of SU(2) x U(1). 




22.2 The SU(2) x U(1) electroweak gauge theory 


22.2.1 Quantum number assignments; Higgs, W and Z 
masses 


Given the preceding motivations for considering a gauge theory of weak in-
teractions, the remaining question is this: what is the relevant symmetry
group of local phase transformations, i.e. the relevant weak gauge group? Sev-
eral possibilities were suggested, but it is now very well established that the
one originally proposed by Glashow (1961), subsequently treated as a spon- 
taneously broken gauge symmetry by Weinberg (1967) and by Salam (1968), 
and later extended by other authors, produces a theory which is in remarkable 
agreement with currently known data. We shall not give a critical review of 
all the experimental evidence, but instead proceed directly to an outline of 
the GSW theory, introducing elements of the data at illustrative points. 

An important clue to the symmetry group involved in the weak interac- 
tions is provided by considering the transitions induced by these interactions. 
This is somewhat analogous to discovering the multiplet structure of atomic 
levels and hence the representations of the rotation group, a prominent sym-
metry of the Schrödinger equation, by studying electromagnetic transitions.
However, there is one very important difference between the ‘weak multiplets’ 
we shall be considering, and those associated with symmetries which are not 
spontaneously broken. We saw in chapter 12 how an unbroken non-Abelian 
symmetry leads to multiplets of states which are degenerate in mass, but in 
section 17.1 we learned that that result only holds provided the vacuum is 
left invariant under the symmetry transformation. When the symmetry is 
spontaneously broken, the vacuum is not invariant, and we must expect that 
the degenerate multiplet structure will then, in general, disappear completely. 
This is precisely the situation in the electroweak theory. 

Nevertheless, as we shall see, essential consequences of the weak symme- 
try group — specifically, the relations it requires between otherwise unrelated 
masses and couplings — are accessible to experiment. Moreover, despite the 
fact that members of a multiplet of a global symmetry which is spontaneously 
broken will, in general, no longer have even approximately the same mass, 
the concept of a multiplet is still useful. This is because when the symmetry 
is made a local one, we shall find (in sections 22.2.2 and 22.2.3) that the as- 
sociated gauge quanta still mediate interactions between members of a given 
symmetry multiplet, just as in the manifest local non-Abelian symmetry ex- 
ample of QCD. Now, the leptonic transitions associated with the weak charged 
currents are, as we saw in chapter 20, $\nu_e \leftrightarrow e$, $\nu_\mu \leftrightarrow \mu$ etc. This suggests that
these pairs should be regarded as doublets under some group. Further we 
saw in section 20.7 how weak transitions involving charged quarks suggested 
a similar doublet structure for them also. The simplest possibility is there- 
fore to suppose that, in both cases, a ‘weak SU(2) group’ is involved, called 
‘weak isospin’. We emphasize once more that this weak isospin is distinct 
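from the hadronic isospin of chapter 12, which is part of SU(3)$_{\rm f}$. We use the
symbols $t$, $t_3$ for the quantum numbers of weak isospin, and make the specific
assignments for the leptonic fields


$$ t = 1/2: \quad \begin{array}{l} t_3 = +1/2 \\ t_3 = -1/2 \end{array} \quad \begin{pmatrix} \hat{\nu}'_e \\ \hat{e}^- \end{pmatrix}_{\rm L}, \ \begin{pmatrix} \hat{\nu}'_\mu \\ \hat{\mu}^- \end{pmatrix}_{\rm L}, \ \begin{pmatrix} \hat{\nu}'_\tau \\ \hat{\tau}^- \end{pmatrix}_{\rm L} \qquad (22.24) $$


where $\hat{e}_{\rm L} = \frac{1}{2}(1 - \gamma_5)\hat{e}$ etc, and for the quark fields


$$ t = 1/2: \quad \begin{array}{l} t_3 = +1/2 \\ t_3 = -1/2 \end{array} \quad \begin{pmatrix} \hat{u} \\ \hat{d}' \end{pmatrix}_{\rm L}, \ \begin{pmatrix} \hat{c} \\ \hat{s}' \end{pmatrix}_{\rm L}, \ \begin{pmatrix} \hat{t} \\ \hat{b}' \end{pmatrix}_{\rm L}. \qquad (22.25) $$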


As discussed in section 20.2.2, the subscript ‘L’ refers to the fact that only the 
left-handed chiral components of the fields enter, in consequence of the V-A 
structure. For this reason, the weak isospin group is referred to as SU(2)$_{\rm L}$,
to show that the weak isospin assignments and corresponding transformation 
properties apply only to these left-handed parts. Notice that, as anticipated 
for a spontaneously broken symmetry, these doublets all involve pairs of parti- 
cles which are not mass degenerate. In (22.24) and (22.25), the primes indicate 
that these fields are related to the (unprimed) fields of definite mass by the 
unitary matrices U (for neutrinos) and V (for quarks), as discussed in sections 
21.4.1 and 20.7.3 respectively. 

Making this SU(2) into a local phase invariance (following the logic of 
chapter 13) will entail the introduction of three gauge fields, transforming as 
a t = 1 multiplet (a triplet) under the group. Because (as with the ordi- 
nary SU(2) of hadronic isospin) the members of a weak isodoublet differ by 
one unit of charge, the two gauge fields associated with transitions between 
doublet members will have charge $\pm 1$. The quanta of these fields will, of
course, be the now familiar $W^\pm$ bosons mediating the charged current tran-
sitions, and associated with the weak isospin raising and lowering operators
$t_\pm$. What about the third gauge boson of the triplet? This will be electrically
neutral, and a very economical and appealing idea would be to associate this 
neutral vector particle with the photon, thereby unifying the weak and elec- 
tromagnetic interactions. A model of this kind was originally suggested by 
Schwinger (1957). Of course, the W’s must somehow acquire mass, while the 
photon remains massless. Schwinger arranged this by introducing appropri- 
ate couplings of the vector bosons to additional scalar and pseudoscalar fields. 
These couplings were arbitrary and no prediction of the W masses could be 
made. We now believe, following the arguments of the preceding section, 
that the W mass must arise via the spontaneous breakdown of a non-Abelian 
gauge symmetry, and as we saw in section 19.6, this does constrain the W 
mass. 

Apart from the question of the W mass in Schwinger’s model, we now 
know (see chapter 20) that there exist neutral current weak interactions, in 
addition to those of the charged currents. We must also include these in our 
emerging gauge theory, and an obvious suggestion is to have these currents 
mediated by the neutral member $W^0$ of the SU(2)$_{\rm L}$ gauge field triplet. Such a
scheme was indeed proposed by Bludman (1958), again pre-Higgs, so that W
masses were put in ‘by hand’. In this model, however, the neutral currents will
have the same pure left-handed V-A structure as the charged currents: but,
as we saw in chapter 20, the neutral currents are not pure V-A. Furthermore, 
the attractive feature of including the photon, and thus unifying weak and 
electromagnetic interactions, has been lost.

A key contribution was made by Glashow (1961); similar ideas were also
advanced by Salam and Ward (1964). Glashow suggested enlarging the
Schwinger-Bludman SU(2) schemes by inclusion of an additional U(1) gauge
group, resulting in an ‘SU(2) $\times$ U(1)’ group structure. The new Abelian U(1)
group is associated with a weak analogue of hypercharge (‘weak hypercharge’),
just as SU(2)$_{\rm L}$ was associated with ‘weak isospin’. Indeed, Glashow pro-
posed that the Gell-Mann-Nishijima relation for charges should also hold for
these weak analogues, giving


$$ eQ = e(t_3 + y/2) \qquad (22.26) $$


for the electric charge Q (in units of e) of the $t_3$ member of a weak 
isomultiplet, assigned a weak hypercharge y. Clearly, therefore, the lepton 
doublets, $(\nu_e, e^-)_L$ etc, then have $y = -1$, while the quark doublets 
$(u, d')_L$ etc, have $y = +\frac{1}{3}$. Now, when this group is gauged, everything falls marvellously into 
place: the charged vector bosons appear as before, but there are now two 
neutral vector bosons, which between them will be responsible for the weak 
neutral current processes, and for electromagnetism. This is exactly the piece 
of mathematics we went through in section 19.6, which we now appropriate 
as an important part of the Standard Model. 

For convenience, we reproduce here the main results of section 19.6. The 
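Higgs field $\phi$ is an SU(2) doublet 

$$\phi = \begin{pmatrix} \phi^+ \\ \phi^0 \end{pmatrix} \tag{22.27}$$

with an assumed vacuum expectation value (in unitary gauge) given by 

$$\langle 0|\phi|0\rangle = \begin{pmatrix} 0 \\ v/\sqrt{2} \end{pmatrix}. \tag{22.28}$$

Fluctuations about this value are parametrized in this gauge by 

$$\phi = \begin{pmatrix} 0 \\ \frac{1}{\sqrt{2}}(v + H) \end{pmatrix}, \tag{22.29}$$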


where H is the (physical) Higgs field. The Lagrangian for the sector consisting 
of the gauge fields and the Higgs fields is 


$$\mathcal{L}_{G\Phi} = (D_\mu\phi)^\dagger(D^\mu\phi) + \mu^2\phi^\dagger\phi - \frac{\lambda}{4}(\phi^\dagger\phi)^2 - \frac{1}{4}\boldsymbol{F}_{\mu\nu}\cdot\boldsymbol{F}^{\mu\nu} - \frac{1}{4}G_{\mu\nu}G^{\mu\nu} \tag{22.30}$$

where $\boldsymbol{F}^{\mu\nu}$ is the SU(2) field strength tensor (19.80) for the gauge fields 
$\boldsymbol{W}^\mu$, $G^{\mu\nu}$ is the U(1) field strength tensor (19.81) for the gauge field 
$B^\mu$, and $D^\mu\phi$ is given by (19.79). After symmetry breaking (i.e. the insertion 
of (22.29) in (22.30)) the quadratic parts of (22.30) can be written in unitary 
gauge as (see problem 19.9) 


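$$\mathcal{L}^{\rm free}_{G\Phi} = \frac{1}{2}\partial_\mu H\,\partial^\mu H - \mu^2 H^2 \tag{22.31}$$
$$\quad - \frac{1}{4}(\partial_\mu W_{1\nu} - \partial_\nu W_{1\mu})(\partial^\mu W_1^\nu - \partial^\nu W_1^\mu) + \frac{1}{8}g^2v^2\,W_{1\mu}W_1^\mu \tag{22.32}$$
$$\quad - \frac{1}{4}(\partial_\mu W_{2\nu} - \partial_\nu W_{2\mu})(\partial^\mu W_2^\nu - \partial^\nu W_2^\mu) + \frac{1}{8}g^2v^2\,W_{2\mu}W_2^\mu \tag{22.33}$$
$$\quad - \frac{1}{4}(\partial_\mu Z_\nu - \partial_\nu Z_\mu)(\partial^\mu Z^\nu - \partial^\nu Z^\mu) + \frac{1}{8}v^2(g^2 + g'^2)\,Z_\mu Z^\mu \tag{22.34}$$
$$\quad - \frac{1}{4}F_{\mu\nu}F^{\mu\nu} \tag{22.35}$$

where 

$$Z^\mu = \cos\theta_W\,W_3^\mu - \sin\theta_W\,B^\mu, \tag{22.36}$$
$$A^\mu = \sin\theta_W\,W_3^\mu + \cos\theta_W\,B^\mu, \tag{22.37}$$

and 

$$F^{\mu\nu} = \partial^\mu A^\nu - \partial^\nu A^\mu, \tag{22.38}$$

with 

$$\cos\theta_W = g/(g^2 + g'^2)^{1/2}, \qquad \sin\theta_W = g'/(g^2 + g'^2)^{1/2}. \tag{22.39}$$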


Feynman rules for the vector boson propagators (in unitary gauge) and cou- 
plings, and for the Higgs couplings, can be read off from (22.30), and are given 
in appendix Q. 

Equations (22.31)-(22.35) give the tree-level masses of the Higgs boson 
and the gauge bosons: (22.31) tells us that the mass of the Higgs boson is 


$$m_H = \sqrt{2}\,\mu = \sqrt{\lambda}\,v/\sqrt{2}, \tag{22.40}$$

where $v/\sqrt{2}$ is the (tree-level) Higgs vacuum value; (22.32) and (22.33) show 
that the charged W's have a mass 

$$M_W = gv/2 \tag{22.41}$$

where g is the SU(2)$_L$ gauge coupling constant; (22.34) gives the mass of the 
$Z^0$ as 

$$M_Z = M_W/\cos\theta_W \tag{22.42}$$

and (22.35) shows that the $A^\mu$ field describes a massless particle (to be 
identified with the photon). 

Still unaccounted for are the right-handed chiral components of the fermion 
fields. There is at present no evidence for any weak interactions coupling to 
the right-handed field components, and it is therefore natural (and a basic 
assumption of the electroweak theory) that all ‘R’ components are singlets 
under the weak isospin group. Crucially, however, the ‘R’ components do 
interact via the U(1) field $B^\mu$; it is this that allows electromagnetism to emerge 
free of parity-violating $\gamma_5$ terms, as we shall see. With the help of the weak 
charge formula (equation (22.26)), we arrive at the assignments shown in table 
22.1. 

TABLE 22.1 
Weak isospin and hypercharge assignments. 

                          t       t3       y       Q
  ν_eL, ν_μL, ν_τL       1/2     1/2      -1       0
  ν_eR, ν_μR, ν_τR        0       0        0       0
  e_L,  μ_L,  τ_L        1/2    -1/2      -1      -1
  e_R,  μ_R,  τ_R         0       0       -2      -1
  u_L,  c_L,  t_L        1/2     1/2      1/3     2/3
  u_R,  c_R,  t_R         0       0       4/3     2/3
  d′_L, s′_L, b′_L       1/2    -1/2      1/3    -1/3
  d′_R, s′_R, b′_R        0       0      -2/3    -1/3
  φ⁺                     1/2     1/2       1       1
  φ⁰                     1/2    -1/2       1       0
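
The assignments in table 22.1 can be checked mechanically against the weak charge formula (22.26). The following is only an illustrative sketch: the dictionary `ASSIGNMENTS`, the row labels and the function names are our own, and one representative per row of the table is used.

```python
from fractions import Fraction as F

def charge(t3, y):
    """Electric charge in units of e from (22.26): Q = t3 + y/2."""
    return t3 + y / 2

# (t3, y, Q) for one representative of each row of table 22.1.
ASSIGNMENTS = {
    "nu_L": (F(1, 2), F(-1), F(0)),
    "nu_R": (F(0), F(0), F(0)),
    "e_L": (F(-1, 2), F(-1), F(-1)),
    "e_R": (F(0), F(-2), F(-1)),
    "u_L": (F(1, 2), F(1, 3), F(2, 3)),
    "u_R": (F(0), F(4, 3), F(2, 3)),
    "d_L": (F(-1, 2), F(1, 3), F(-1, 3)),
    "d_R": (F(0), F(-2, 3), F(-1, 3)),
    "phi+": (F(1, 2), F(1), F(1)),
    "phi0": (F(-1, 2), F(1), F(0)),
}

def check_table():
    """True if every tabulated Q agrees with t3 + y/2."""
    return all(charge(t3, y) == q for t3, y, q in ASSIGNMENTS.values())

print(check_table())
```

Exact fractions are used so that the comparison does not depend on floating-point rounding.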

We have included ‘R’ components for the neutrinos in the table. It is, 
however, fair to say that in the original Standard Model the neutrinos were 
taken to be massless, with no neutrino mixing. We have seen in chapter 20 
that it is for many purposes an excellent approximation to treat the neutrinos 
as massless, except when discussing neutrino oscillations. We shall mention 
their masses again in section 22.5.2, but for the moment we proceed in the 
‘massless neutrinos’ approximation. In this case, there are no ‘R’ components 
for neutrinos, and no neutrino mixing. 

We can now proceed to write down the currents of the electroweak theory. 
We will show that these dynamical symmetry currents are precisely the same 
as the phenomenological currents of the current-current model developed in 
chapter 20. The new feature here is that — as in the electromagnetic case — 
the currents interact with each other by the exchange of a gauge boson, rather 
than directly. 


22.2.2 The leptonic currents (massless neutrinos): relation 
to current-current model 


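We write the SU(2)$_L$ × U(1) covariant derivative, in terms of the fields $\boldsymbol{W}^\mu$ 
and $B^\mu$ of section 19.6, as 

$$D^\mu = \partial^\mu + \mathrm{i}g\,\boldsymbol{\tau}\cdot\boldsymbol{W}^\mu/2 + \mathrm{i}g'yB^\mu/2 \quad \text{on ‘L’ SU(2) doublets} \tag{22.43}$$

and as 

$$D^\mu = \partial^\mu + \mathrm{i}g'yB^\mu/2 \quad \text{on ‘R’ SU(2) singlets.} \tag{22.44}$$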

The leptonic couplings to the gauge fields therefore arise from the 
‘gauge-covariantized’ free leptonic Lagrangian: 

$$\mathcal{L}_{\rm lept} = \sum_{f=e,\mu,\tau} \bar{l}_{fL}\,\mathrm{i}\slashed{D}\,l_{fL} + \sum_{f=e,\mu,\tau} \bar{l}_{fR}\,\mathrm{i}\slashed{D}\,l_{fR}, \tag{22.45}$$

where the $l_{fL}$ are the left-handed doublets 

$$l_{fL} = \begin{pmatrix} \nu_f \\ f^- \end{pmatrix}_L \tag{22.46}$$

and the $l_{fR}$ are the singlets $l_{eR} = e_R$ etc. 

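Consider first the charged leptonic currents. The correct normalization for 
the charged fields is that $W^\mu \equiv (W_1^\mu - \mathrm{i}W_2^\mu)/\sqrt{2}$ destroys the $W^+$ or creates 
the $W^-$ (cf (7.15)). The ‘$\boldsymbol{\tau}\cdot\boldsymbol{W}/2$’ terms can be written as 

$$\boldsymbol{\tau}\cdot\boldsymbol{W}^\mu/2 = \frac{1}{\sqrt{2}}\left\{\tau_+\frac{(W_1^\mu - \mathrm{i}W_2^\mu)}{\sqrt{2}} + \tau_-\frac{(W_1^\mu + \mathrm{i}W_2^\mu)}{\sqrt{2}}\right\} + \frac{\tau_3}{2}W_3^\mu, \tag{22.47}$$

where $\tau_\pm = (\tau_1 \pm \mathrm{i}\tau_2)/2$ are the usual raising and lowering operators for the 
doublets. Thus the ‘f=e’ contribution to the first term in (22.45) picks out 
the process $e^- \to \nu_e + W^-$ for example, with the result that the corresponding 
vertex is given by 

$$-\frac{\mathrm{i}g}{\sqrt{2}}\,\gamma^\mu\,\frac{(1-\gamma_5)}{2}. \tag{22.48}$$

The ‘universality’ of the single coupling constant ‘g’ ensures that (22.48) is 
also the amplitude for the $\mu$-$\nu_\mu$-W and $\tau$-$\nu_\tau$-W vertices. Thus the 
amplitude for the $\nu_\mu + e^- \to \mu^- + \nu_e$ process considered in section 20.8 is 

$$\left\{-\frac{\mathrm{i}g}{\sqrt{2}}\,\bar{u}(\mu)\gamma_\mu\frac{(1-\gamma_5)}{2}u(\nu_\mu)\right\}\frac{\mathrm{i}\left[-g^{\mu\nu} + k^\mu k^\nu/M_W^2\right]}{k^2 - M_W^2}\left\{-\frac{\mathrm{i}g}{\sqrt{2}}\,\bar{u}(\nu_e)\gamma_\nu\frac{(1-\gamma_5)}{2}u(e)\right\} \tag{22.49}$$

corresponding to the Feynman graph of figure 22.10. 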
For $k^2 \ll M_W^2$, we can replace the W-propagator by the constant value 
$\mathrm{i}g^{\mu\nu}/M_W^2$, leading to the amplitude 

$$-\frac{\mathrm{i}g^2}{8M_W^2}\,\bar{u}(\mu)\gamma_\mu(1-\gamma_5)u(\nu_\mu)\,\bar{u}(\nu_e)\gamma^\mu(1-\gamma_5)u(e), \tag{22.50}$$

which may be compared with the form we used in the current-current theory, 
equation (20.50). This comparison gives 

$$\frac{G_F}{\sqrt{2}} = \frac{g^2}{8M_W^2}. \tag{22.51}$$

This is an important equation, giving the precise version, in the GSW theory, 
of the qualitative relation $g^2/M_W^2 \sim G_F$ introduced following equation (22.20), 
and in volume 1, at equation (1.32). 

FIGURE 22.10 
W-exchange process in $\nu_\mu + e^- \to \mu^- + \nu_e$. 

Putting together (22.41) and (22.51) we can deduce 

$$G_F/\sqrt{2} = 1/(2v^2) \tag{22.52}$$

so that from the known value (22.4) of $G_F$ there follows the value of v: 

$$v \simeq 246\ \text{GeV}. \tag{22.53}$$

Alternatively we may quote $v/\sqrt{2}$ (the vacuum value of the Higgs field): 

$$v/\sqrt{2} \simeq 174\ \text{GeV}. \tag{22.54}$$

This parameter sets the scale of electroweak symmetry breaking, but as yet 
no theory is able to predict its value. It is related to the parameters $\lambda$, $\mu$ of 
(22.30) by $v/\sqrt{2} = \sqrt{2}\,\mu/\lambda^{1/2}$ (cf (17.98)). 
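
As a numerical illustration of (22.52)-(22.54), the short sketch below solves for v. The value of the Fermi constant is an assumed input (we take $G_F = 1.16637\times10^{-5}$ GeV$^{-2}$), and the function name is ours.

```python
import math

G_F = 1.16637e-5  # Fermi constant in GeV^-2 (assumed input value)

def higgs_vev(g_fermi=G_F):
    """Solve (22.52), G_F/sqrt(2) = 1/(2 v^2), for v in GeV."""
    return 1.0 / math.sqrt(math.sqrt(2.0) * g_fermi)

v = higgs_vev()
print(f"v = {v:.1f} GeV, v/sqrt(2) = {v / math.sqrt(2.0):.1f} GeV")
```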

In general, the charge-changing part of (22.45) can be written as 

$$-\frac{g}{\sqrt{2}}\left\{\bar{\nu}_e\gamma^\mu\frac{(1-\gamma_5)}{2}e + \bar{\nu}_\mu\gamma^\mu\frac{(1-\gamma_5)}{2}\mu + \bar{\nu}_\tau\gamma^\mu\frac{(1-\gamma_5)}{2}\tau\right\}W_\mu + \text{hermitian conjugate}, \tag{22.55}$$

where $W^\mu = (W_1^\mu - \mathrm{i}W_2^\mu)/\sqrt{2}$. (22.55) has the form 

$$-j^\mu_{\rm CC}(\text{leptons})\,W_\mu - j^{\mu\dagger}_{\rm CC}(\text{leptons})\,W^\dagger_\mu \tag{22.56}$$

where the leptonic weak charged current $j^\mu_{\rm CC}$(leptons) is precisely that used in 
the current-current model (equation (20.38)), up to the usual factors of g's and 
$\sqrt{2}$'s. Thus the dynamical symmetry currents of the SU(2)$_L$ gauge theory are 
exactly the ‘phenomenological’ currents of the earlier current-current model. 
The Feynman rules for the lepton-W couplings (appendix Q) can be read off 
from (22.55). 

Turning now to the leptonic weak neutral current, this will appear via the 
couplings to the $Z^0$, written as 

$$-j^\mu_{\rm NC}(\text{leptons})\,Z_\mu. \tag{22.57}$$

Referring to (22.36) for the linear combination of $W_3^\mu$ and $B^\mu$ which represents 
$Z^\mu$, we find (problem 22.4) 

$$j^\mu_{\rm NC}(\text{leptons}) = \frac{g}{\cos\theta_W}\sum_l \bar{\psi}_l\gamma^\mu\left[t_3^l\,\frac{(1-\gamma_5)}{2} - \sin^2\theta_W\,Q_l\right]\psi_l, \tag{22.58}$$

where the sum is over the six lepton fields $\nu_e, e^-, \nu_\mu, \ldots, \tau^-$. For the Q = 0 
neutrinos with $t_3 = +\frac{1}{2}$, 

$$j^\mu_{\rm NC}(\text{neutrinos}) = \frac{g}{2\cos\theta_W}\sum_l \bar{\nu}_l\gamma^\mu\frac{(1-\gamma_5)}{2}\nu_l, \tag{22.59}$$

where now $l = e, \mu, \tau$. For the other (negatively charged) leptons, we shall 
have both L and R couplings from (22.58), and we can write 

$$j^\mu_{\rm NC}(\text{charged leptons}) = \frac{g}{\cos\theta_W}\sum_{l=e,\mu,\tau}\bar{l}\gamma^\mu\left[c_L^l\,\frac{(1-\gamma_5)}{2} + c_R^l\,\frac{(1+\gamma_5)}{2}\right]l, \tag{22.60}$$

where 

$$c_L^l = t_3^l - \sin^2\theta_W\,Q_l = -\frac{1}{2} + \sin^2\theta_W \tag{22.61}$$
$$c_R^l = -\sin^2\theta_W\,Q_l = \sin^2\theta_W. \tag{22.62}$$

As noted earlier, the $Z^0$ coupling is not pure ‘V-A’. These relations (22.59)- 
(22.62) are exactly the ones given earlier, in (20.85)-(20.87); in particular, 
the couplings are independent of ‘l’ and hence exhibit lepton universality. 
The alternative notation 

$$j^\mu_{\rm NC}(\text{charged leptons}) = \frac{g}{2\cos\theta_W}\sum_l \bar{l}\gamma^\mu(g_V^l - g_A^l\gamma_5)\,l \tag{22.63}$$

is often used, where 

$$g_V^l = -\frac{1}{2} + 2\sin^2\theta_W, \qquad g_A^l = -\frac{1}{2}, \quad \text{independent of } l. \tag{22.64}$$

Note that the $g_V^l$ vanishes for $\sin^2\theta_W = 0.25$. Again, the Feynman rules for 
lepton-Z couplings (appendix Q) are contained in (22.59) and (22.60). 
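
The relation between the two notations, $g_V = c_L + c_R$ and $g_A = c_L - c_R$, follows on comparing (22.60) with (22.63). A minimal sketch that evaluates the couplings for a charged lepton is given below; the function name and argument names are our own.

```python
def z_couplings(t3, charge, sin2_theta_w):
    """Return (c_L, c_R, g_V, g_A) following (22.61)-(22.64)."""
    c_left = t3 - sin2_theta_w * charge
    c_right = -sin2_theta_w * charge
    g_v = c_left + c_right  # equals t3 - 2 sin^2(theta_W) Q
    g_a = c_left - c_right  # equals t3
    return c_left, c_right, g_v, g_a

# Charged lepton (t3 = -1/2, Q = -1) for two values of sin^2(theta_W).
for s2 in (0.23, 0.25):
    c_l, c_r, g_v, g_a = z_couplings(-0.5, -1.0, s2)
    print(f"sin2={s2}: c_L={c_l:+.3f} c_R={c_r:+.3f} g_V={g_v:+.3f} g_A={g_a:+.3f}")
```

For $\sin^2\theta_W = 0.25$ the vector coupling of the charged leptons vanishes, as noted after (22.64).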

As in the case of W-mediated charge-changing processes, $Z^0$-mediated 
processes reduce to the current-current form at low $k^2$. For example, the 
amplitude for $e^-\mu^- \to e^-\mu^-$ via $Z^0$ exchange (figure 22.11) reduces to 

$$-\frac{\mathrm{i}g^2}{4\cos^2\theta_W M_Z^2}\,\bar{u}(e)\gamma_\mu\left[c_L^l(1-\gamma_5) + c_R^l(1+\gamma_5)\right]u(e)\;\times\;\bar{u}(\mu)\gamma^\mu\left[c_L^l(1-\gamma_5) + c_R^l(1+\gamma_5)\right]u(\mu). \tag{22.65}$$

FIGURE 22.11 
$Z^0$-exchange process in $e^-\mu^- \to e^-\mu^-$. 

It is customary to define the parameter 

$$\rho = M_W^2/(M_Z^2\cos^2\theta_W), \tag{22.66}$$

which is unity at tree-level, in the absence of loop corrections. The ratio of 
factors in front of the $\bar{u}\ldots u$ expressions in (22.65) and (22.50) (i.e. ‘neutral 
current process’ / ‘charged current process’) is then $2\rho$. 

We may also check the electromagnetic current in the theory, by looking 
for the piece that couples to $A^\mu$. We find 

$$j^\mu_{\rm emag} = -g\sin\theta_W\sum_{l=e,\mu,\tau}\bar{l}\gamma^\mu l \tag{22.67}$$

which allows us to identify the electromagnetic charge e as 

$$e = g\sin\theta_W \tag{22.68}$$

as already suggested in (19.97) of chapter 19. Note that all the $\gamma_5$'s cancel 
from (22.67), as is of course required. 


22.2.3 The quark currents 


The charge-changing quark currents, which are coupled to the $W^\pm$ fields, have 
a form very similar to that of the charged leptonic currents, except that the 
$t_3 = -\frac{1}{2}$ components of the L-doublets have to be understood as the 
flavour-mixed (weakly interacting) states 

$$\begin{pmatrix} d' \\ s' \\ b' \end{pmatrix}_L = \begin{pmatrix} V_{ud} & V_{us} & V_{ub} \\ V_{cd} & V_{cs} & V_{cb} \\ V_{td} & V_{ts} & V_{tb} \end{pmatrix}\begin{pmatrix} d \\ s \\ b \end{pmatrix}_L, \tag{22.69}$$

where d, s and b are the strongly interacting fields with masses $m_d$, $m_s$ and 
$m_b$, and the V-matrix is the CKM matrix used extensively in chapter 21. We 
shall discuss this matrix further in section 22.5.2. Thus the charge-changing 
weak quark current is 

$$j^\mu_{\rm CC}(\text{quarks}) = \frac{g}{\sqrt{2}}\left\{\bar{u}\gamma^\mu\frac{(1-\gamma_5)}{2}d' + \bar{c}\gamma^\mu\frac{(1-\gamma_5)}{2}s' + \bar{t}\gamma^\mu\frac{(1-\gamma_5)}{2}b'\right\}, \tag{22.70}$$

which generalizes (20.90) to three generations and supplies the factor $g/\sqrt{2}$, 
as for the leptons. 

The neutral currents are diagonal in flavour if the matrix V is unitary (see 
also section 22.5.2). Thus $j^\mu_{\rm NC}$(quarks) will be given by the same expression 
as (20.103), except that now the sum will be over all six quark flavours. The 
neutral weak quark current is thus 


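$$j^\mu_{\rm NC}(\text{quarks}) = \frac{g}{\cos\theta_W}\sum_q \bar{q}\gamma^\mu\left[c_L^q\,\frac{(1-\gamma_5)}{2} + c_R^q\,\frac{(1+\gamma_5)}{2}\right]q, \tag{22.71}$$

where 

$$c_L^q = t_3^q - \sin^2\theta_W\,Q_q \tag{22.72}$$
$$c_R^q = -\sin^2\theta_W\,Q_q. \tag{22.73}$$

These expressions are exactly as given in (20.103)-(20.105). As for the charged 
leptons, we can alternatively write (22.71) as 

$$j^\mu_{\rm NC}(\text{quarks}) = \frac{g}{2\cos\theta_W}\sum_q \bar{q}\gamma^\mu(g_V^q - g_A^q\gamma_5)\,q, \tag{22.74}$$

where 

$$g_V^q = t_3^q - 2\sin^2\theta_W\,Q_q, \tag{22.75}$$
$$g_A^q = t_3^q. \tag{22.76}$$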

Before proceeding to discuss some simple phenomenological consequences, 
we remind the reader of one important feature of the Standard Model currents 
in general. Reading (22.24) and (22.25) together ‘vertically’, the leptons and 
quarks are grouped in three generations, each with two leptons and two quarks. 
The theoretical motivation for such family grouping is that anomalies are 
cancelled within each complete generation, as discussed in section 18.4. 




22.3 Simple (tree-level) predictions 


The theory as so far developed has just 4 parameters: the gauge couplings g 
and g', and the parameters $\lambda$ and $\mu$ of the Higgs potential. The previous two 
subsections show that all the couplings to fermions can be written in terms 
of the known quantities $G_F$ and e (or $\alpha$), and one free parameter which may 
be taken to be $\sin\theta_W$. We noted in section 20.9 that, before the discovery of 
the W and Z particles, the then known neutrino data were consistent with a 
single value of $\theta_W$ given by $\sin^2\theta_W \simeq 0.23$. Using (22.51) and (22.68), it was 
then possible to predict the value of $M_W$: 

$$M_W = \left(\frac{\pi\alpha}{\sqrt{2}\,G_F}\right)^{1/2}\frac{1}{\sin\theta_W} \simeq \frac{37.28}{\sin\theta_W}\ \text{GeV} \simeq 77.73\ \text{GeV}. \tag{22.77}$$

Similarly, using (22.42) we predict 

$$M_Z = M_W/\cos\theta_W \simeq 88.58\ \text{GeV}. \tag{22.78}$$


These predictions of the theory (at lowest order) indicate the power of the 
underlying symmetry to tie together many apparently unrelated quantities, 
which are all determined in terms of only a few basic parameters. We now 
present a number of other simple tree-level predictions. 
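
The numbers in (22.77) and (22.78) follow from combining $G_F/\sqrt{2} = g^2/8M_W^2$ with $e = g\sin\theta_W$ and $\alpha = e^2/4\pi$. The sketch below evaluates them; the values of $\alpha$ and $G_F$ are assumed inputs and the function name is ours.

```python
import math

ALPHA = 1.0 / 137.036  # fine structure constant (assumed input)
G_F = 1.16637e-5       # Fermi constant in GeV^-2 (assumed input)

def tree_level_masses(sin2_theta_w):
    """Return (M_W, M_Z) in GeV from (22.77) and (22.78)."""
    scale = math.sqrt(math.pi * ALPHA / (math.sqrt(2.0) * G_F))  # about 37.28 GeV
    m_w = scale / math.sqrt(sin2_theta_w)
    m_z = m_w / math.sqrt(1.0 - sin2_theta_w)
    return m_w, m_z

m_w, m_z = tree_level_masses(0.23)
print(f"M_W = {m_w:.2f} GeV, M_Z = {m_z:.2f} GeV")
```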
The width for $W^- \to e^- + \bar{\nu}_e$ can be calculated using the vertex (22.48), 
with the result (problem 22.5) 

$$\Gamma(W^- \to e^-\bar{\nu}_e) = \frac{1}{12}\frac{g^2}{4\pi}M_W = \frac{G_F}{2^{1/2}}\frac{M_W^3}{6\pi} \simeq 205\ \text{MeV}, \tag{22.79}$$

using (22.77). The widths to $\mu^-\bar{\nu}_\mu$, $\tau^-\bar{\nu}_\tau$ are the same. Neglecting CKM 
flavour mixing among the two energetically allowed quark channels $\bar{u}d$ and $\bar{c}s$, 
their widths would also be the same, apart from a factor of 3 for the different 
colour channels. The total W width for all these channels will therefore be 
about nine times the value in (22.79), i.e. 1.85 GeV, while the branching ratio 
for $W \to e\nu$ is 

$$B(e\nu) = \Gamma(W \to e\nu)/\Gamma(\text{total}) \simeq 11\%. \tag{22.80}$$


In making these estimates we have neglected all fermion masses. 
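
The counting behind the factor of nine (three leptonic channels plus two quark channels, each with three colours) can be made explicit in a few lines. This is only a sketch under the stated massless-fermion, no-mixing assumptions; the constant and function names are ours.

```python
import math

G_F = 1.16637e-5  # Fermi constant in GeV^-2 (assumed input)

def w_leptonic_width(m_w):
    """Gamma(W -> e nu) in GeV from (22.79)."""
    return G_F * m_w**3 / (6.0 * math.pi * math.sqrt(2.0))

def w_total_width(m_w, n_leptons=3, n_quark_channels=2, n_colours=3):
    """Total tree-level width: equal partial widths, quarks counted per colour."""
    n_channels = n_leptons + n_colours * n_quark_channels
    return n_channels * w_leptonic_width(m_w)

m_w = 77.73  # tree-level value from (22.77), in GeV
partial = w_leptonic_width(m_w)
total = w_total_width(m_w)
print(f"Gamma(e nu) = {1000 * partial:.0f} MeV, total = {total:.2f} GeV, "
      f"B(e nu) = {100 * partial / total:.0f}%")
```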
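The width for $Z^0 \to \nu\bar{\nu}$ can be found from (22.79) by replacing $g/2^{1/2}$ by 
$g/2\cos\theta_W$, and $M_W$ by $M_Z$, giving 

$$\Gamma(Z^0 \to \nu\bar{\nu}) = \frac{1}{24}\frac{g^2}{4\pi}\frac{M_Z}{\cos^2\theta_W} = \frac{G_F}{2^{1/2}}\frac{M_Z^3}{12\pi} \simeq 152\ \text{MeV}, \tag{22.81}$$

using (22.78). Charged lepton pairs couple with both $c_L^l$ and $c_R^l$ terms, leading 
(with neglect of lepton masses) to 

$$\Gamma(Z^0 \to l\bar{l}) = \left(\frac{|c_L^l|^2 + |c_R^l|^2}{6}\right)\frac{g^2 M_Z}{4\pi\cos^2\theta_W}. \tag{22.82}$$

The values $c_L^\nu = \frac{1}{2}$, $c_R^\nu = 0$ in (22.82) reproduce (22.81). With $\sin^2\theta_W = 0.23$, 
we find 

$$\Gamma(Z^0 \to l\bar{l}) \simeq 76.5\ \text{MeV}. \tag{22.83}$$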

FIGURE 22.12 
Neutrino-electron graphs involving $Z^0$ exchange. 

Quark pairs couple as in (22.71), the GIM mechanism ensuring that all 
flavour-changing terms cancel. The total width to $u\bar{u}$, $d\bar{d}$, $c\bar{c}$, $s\bar{s}$ and $b\bar{b}$ channels 
(allowing 3 for colour and neglecting masses) is then 1538 MeV, producing 
an estimated total width of approximately 2.22 GeV. (QCD corrections 
will increase these estimates by a factor of order 1.1). The branching ratio 
to charged leptons is approximately 3.4%, to the three (invisible) neutrino 
channels 20.5%, and to hadrons (via hadronization of the $q\bar{q}$ channels) about 
69.3%. In section 22.4.3 we shall see how a precise measurement of the total 
$Z^0$ width at LEP determined the number of light neutrinos to be 3. 
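
The partial widths quoted above can be reproduced by applying (22.82) channel by channel, with the couplings (22.72) and (22.73) and a colour factor of 3 for quarks. The sketch below does this with the tree-level inputs of this section; the value of $G_F$ is an assumed input, the helper name `z_width` is ours, and the $g^2$ in (22.82) has been eliminated in favour of $G_F$ using (22.51) and (22.42).

```python
import math

G_F = 1.16637e-5  # Fermi constant in GeV^-2 (assumed input)
SIN2 = 0.23       # sin^2(theta_W)
M_Z = 88.58       # tree-level value from (22.78), in GeV

def z_width(t3, charge, colours=1, m_z=M_Z, sin2=SIN2):
    """Tree-level Gamma(Z -> f fbar) in GeV for a massless fermion f."""
    c_left = t3 - sin2 * charge
    c_right = -sin2 * charge
    unit = G_F * m_z**3 / (3.0 * math.pi * math.sqrt(2.0))
    return colours * (c_left**2 + c_right**2) * unit

nu = z_width(0.5, 0.0)
lepton = z_width(-0.5, -1.0)
up = z_width(0.5, 2.0 / 3.0, colours=3)
down = z_width(-0.5, -1.0 / 3.0, colours=3)

hadrons = 2 * up + 3 * down            # u, c and d, s, b (top is too heavy)
total = hadrons + 3 * lepton + 3 * nu

print(f"nu: {1000 * nu:.0f} MeV, lepton: {1000 * lepton:.1f} MeV")
print(f"hadrons: {1000 * hadrons:.0f} MeV, total: {total:.2f} GeV")
print(f"invisible fraction: {100 * 3 * nu / total:.1f}%")
```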

Cross sections for neutrino-lepton scattering proceeding via $Z^0$ exchange 
can be calculated (for $k^2 \ll M_Z^2$) using the currents (22.59) and (22.60), and 
the method of section 20.5. Examples are 

$$\nu_\mu e^- \to \nu_\mu e^- \tag{22.84}$$

and 

$$\bar{\nu}_\mu e^- \to \bar{\nu}_\mu e^- \tag{22.85}$$

as shown in figure 22.12. Since the neutral current for the electron is not pure 
V-A, as was the charged current, we expect to see terms involving both $|c_L^l|^2$ 
and $|c_R^l|^2$, and possibly an interference term. The cross section for (22.84) is 
found to be ('t Hooft 1971c) 

$$d\sigma/dy = (2G_F^2 E m_e/\pi)\left[|c_L^l|^2 + |c_R^l|^2(1-y)^2 - \tfrac{1}{2}(c_R^{l*}c_L^l + c_L^{l*}c_R^l)\,y\,m_e/E\right], \tag{22.86}$$

where E is the energy of the incident neutrino in the ‘laboratory’ system, and 
$y = (E - E')/E$ as before, where $E'$ is the energy of the outgoing neutrino in 
the ‘laboratory’ system (see footnote 2). Equation (22.86) may be compared with the 
$\nu_\mu e^- \to \mu^-\nu_e$ (charged current) cross section of (20.84) by noting that $t = -2m_eEy$: 
the $|c_L^l|^2$ term agrees with the pure V-A result (20.84), while the $|c_R^l|^2$ term 
involves the same $(1 - y)^2$ factor discussed for $\bar{\nu}q$ scattering in section 20.7.2. 
The interference term is negligible for $E \gg m_e$. The cross section for the 
antineutrino process (22.85) is found from (22.86) by interchanging $c_L^l$ and 
$c_R^l$. 

Footnote 2: In the kinematics, lepton masses have been neglected wherever possible. 

FIGURE 22.13 
One-W annihilation graph in $\bar{\nu}_e e^- \to \bar{\nu}_e e^-$. 

A third neutrino-lepton process is experimentally available, 

$$\bar{\nu}_e e^- \to \bar{\nu}_e e^-, \tag{22.87}$$

the cross section for which was measured by Reines, Gurr and Sobel (1976), 
using electron antineutrinos from an 1800-MW fission reactor at Savannah 
River. In this case there is a single W intermediate state graph, shown in 
figure 22.13, to consider as well as the $Z^0$ one; the latter is similar to the 
right-hand graph in figure 22.12, but with $\bar{\nu}_\mu$ replaced by $\bar{\nu}_e$. The cross section for 
(22.87) turns out to be given by an expression of the form (22.86), but with 
the replacements 

$$c_L^l \to \frac{1}{2} + \sin^2\theta_W, \qquad c_R^l \to \sin^2\theta_W. \tag{22.88}$$

Reines, Gurr and Sobel reported the result $\sin^2\theta_W = 0.29 \pm 0.05$. 

We emphasize once more that all these cross sections are determined in 
terms of $G_F$, $\alpha$ and only one further parameter, $\sin^2\theta_W$. As mentioned in 
section 20.9, experimental fits to these predictions are reviewed by Commins 
and Bucksbaum (1983), Renton (1990) and Winter (2000). 

Particularly precise determinations of the Standard Model parameters 
were made at the $e^+e^-$ colliders, LEP and SLC. Consider the reaction $e^+e^- \to f\bar{f}$ 
where f is $\mu$ or $\tau$, at energies where the lepton masses may be neglected in 
the final answers. In lowest order, the process is mediated by both $\gamma$-exchange 
and $Z^0$-exchange as shown in figure 22.14. Calculations of the cross section 
were made early on, by Budny (1973) for example. In modern notation, the 
differential cross section for the scattering of unpolarized $e^-$ and $e^+$ is given 
by 

$$\frac{d\sigma}{d\cos\theta} = \frac{\pi\alpha^2}{2s}\left[(1 + \cos^2\theta)A + \cos\theta\,B\right] \tag{22.89}$$

where $\theta$ is the CM scattering angle of the final state lepton, $s = (p_{e^-} + p_{e^+})^2$, 
and 

$$A = 1 + 2g_V^e g_V^f\,\mathrm{Re}\,\chi(s) + [(g_A^e)^2 + (g_V^e)^2][(g_A^f)^2 + (g_V^f)^2]\,|\chi(s)|^2 \tag{22.90}$$
$$B = 4g_A^e g_A^f\,\mathrm{Re}\,\chi(s) + 8g_A^e g_V^e g_A^f g_V^f\,|\chi(s)|^2 \tag{22.91}$$
$$\chi(s) = s/[4\sin^2\theta_W\cos^2\theta_W(s - M_Z^2 + \mathrm{i}\Gamma_Z M_Z)]. \tag{22.92}$$

Notice that the term surviving when all the g's are set to zero, which is 
therefore the pure single photon contribution, is exactly as calculated in problem 
8.18. The presence of the $\cos\theta$ term leads to the forward-backward asymmetry 
noted in that problem. 

FIGURE 22.14 
(a) One-$\gamma$ and (b) one-$Z^0$ annihilation graphs in $e^+e^- \to f\bar{f}$. 

The forward-backward asymmetry $A_{FB}$ may be defined as 

$$A_{FB} \equiv (N_F - N_B)/(N_F + N_B), \tag{22.93}$$

where $N_F$ is the number scattered into the forward hemisphere $0 \le \cos\theta \le 1$, 
and $N_B$ that into the backward hemisphere $-1 \le \cos\theta \le 0$. Integrating 
(22.89) one easily finds 

$$A_{FB} = 3B/8A. \tag{22.94}$$

For $\sin^2\theta_W = 0.25$ we noted after (22.64) that the $g_V^l$'s vanish, so they are 
very small for $\sin^2\theta_W \simeq 0.23$. The effect is therefore controlled essentially by 
the first term in (22.91). At $\sqrt{s} = 29$ GeV, for example, the asymmetry is 
$A_{FB} \simeq -0.063$. 
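
The value quoted at $\sqrt{s} = 29$ GeV can be checked directly from (22.90)-(22.94). The sketch below assumes lepton universality (the same $g_V$, $g_A$ for the electron and the final state lepton) and, as our own choice of inputs, uses the tree-level estimates $M_Z = 88.58$ GeV and $\Gamma_Z = 2.22$ GeV of this section; the function names are ours.

```python
def chi(s, m_z, gamma_z, sin2):
    """The Z propagator factor chi(s) of (22.92); s in GeV^2."""
    return s / (4.0 * sin2 * (1.0 - sin2) * complex(s - m_z**2, gamma_z * m_z))

def forward_backward_asymmetry(sqrt_s, sin2=0.23, m_z=88.58, gamma_z=2.22):
    """Tree-level A_FB = 3B/8A from (22.90), (22.91) and (22.94)."""
    g_a = -0.5
    g_v = -0.5 + 2.0 * sin2
    x = chi(sqrt_s**2, m_z, gamma_z, sin2)
    a = 1.0 + 2.0 * g_v * g_v * x.real + (g_v**2 + g_a**2) ** 2 * abs(x) ** 2
    b = 4.0 * g_a * g_a * x.real + 8.0 * (g_v * g_a) ** 2 * abs(x) ** 2
    return 3.0 * b / (8.0 * a)

print(f"A_FB(29 GeV) = {forward_backward_asymmetry(29.0):.3f}")
```

Well below the Z peak, Re $\chi(s)$ is negative, so the asymmetry is negative and is driven by the axial couplings through the first term of (22.91).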

This asymmetry was observed in experiments with PETRA at DESY and 
with PEP at SLAC (see figure 8.20(b)). These measurements, made at 
energies well below the $Z^0$ peak, were the first indication of the presence of $Z^0$ 
exchange in $e^+e^-$ collisions. 

However, QED alone produces a small positive $A_{FB}$, through interference 
between $1\gamma$ and $2\gamma$ annihilation processes (which have different charge 
conjugation parity), as well as between initial and final state bremsstrahlung 
corrections to figure 22.14(a). Indeed, all one-loop radiative effects must clearly 
be considered, in any comparison with modern high precision data. 

At the CERN $e^+e^-$ collider LEP, many such measurements were made 
‘on the Z peak’, i.e. at $s = M_Z^2$ in the parametrization (22.92). In that case, 
Re $\chi(s) = 0$, and (22.94) becomes (neglecting the photon contribution) 

$$A_{FB}(Z\ \text{peak}) = \frac{3g_V^e g_A^e g_V^f g_A^f}{[(g_V^e)^2 + (g_A^e)^2][(g_V^f)^2 + (g_A^f)^2]}. \tag{22.95}$$


Another important asymmetry observable is that involving the difference 
of the cross sections for left- and right-handed incident electrons: 

$$A_{LR} \equiv (\sigma_L - \sigma_R)/(\sigma_L + \sigma_R), \tag{22.96}$$

for which the tree-level prediction is 

$$A_{LR} = 2g_V^e g_A^e/[(g_V^e)^2 + (g_A^e)^2]. \tag{22.97}$$

A similar combination of the g's for the final state leptons can be measured 
by forming the ‘L-R F-B’ asymmetry 

$$A^{LR}_{FB} = [(\sigma_{LF} - \sigma_{LB}) - (\sigma_{RF} - \sigma_{RB})]/(\sigma_R + \sigma_L) \tag{22.98}$$

for which the tree-level prediction is 

$$A^{LR}_{FB} = 2g_V^f g_A^f/[(g_V^f)^2 + (g_A^f)^2]. \tag{22.99}$$

The quantity on the right-hand side of (22.99) is usually denoted by $A_f$: 

$$A_f = 2g_V^f g_A^f/[(g_V^f)^2 + (g_A^f)^2]. \tag{22.100}$$


The asymmetry $A_{FB}$ is not, in fact, direct evidence for parity violation in 
$e^+e^- \to \mu^+\mu^-$, since we see from (22.90) and (22.91) that it is even under 
$g_A^l \to -g_A^l$, whereas a true parity-violating effect would involve terms odd 
(linear) in $g_A^l$. However, electroweak-induced parity violation effects in an 
apparently electromagnetic process were observed in a remarkable experiment 
by Prescott et al. (1978). Longitudinally polarized electrons were inelastically 
scattered from deuterium, and the flux of scattered electrons was measured 
for incident electrons of definite helicity. An asymmetry between the results, 
depending on the helicities, was observed, a clear signal for parity violation. 
This was the first demonstration of parity-violating effects in an 
‘electromagnetic’ process; the corresponding value of $\sin^2\theta_W$ was in agreement with that 
determined from $\nu$ data. 

We now turn to some of the main experimental evidence, beginning with 
the discoveries of the $W^\pm$ and $Z^0$ in 1983. 

FIGURE 22.15 
Parton model amplitude for $W^\pm$ or $Z^0$ production in $p\bar{p}$ collisions. 


22.4 The discovery of the $W^\pm$ and $Z^0$ at the CERN $p\bar{p}$ collider 


22.4.1 Production cross sections for W and Z in $p\bar{p}$ colliders 


The possibility of producing the predicted $W^\pm$ and $Z^0$ particles was the 
principal motivation for transforming the CERN SPS into a $p\bar{p}$ collider using the 
stochastic cooling technique (Rubbia et al. 1977, Staff of the CERN $p\bar{p}$ project 
1981). Estimates of W and $Z^0$ production in $p\bar{p}$ collisions may be obtained 
(see, for example, Quigg 1977) from the parton model, in a way analogous to 
that used for the Drell-Yan process in section 9.4 with $\gamma$ replaced by W or 
$Z^0$, as shown in figure 22.15 (cf figure 9.11), and for two-jet cross sections in 
section 14.3.2. As in (14.51), we denote by $\hat{s}$ the subprocess invariant 

$$\hat{s} = (x_1p_1 + x_2p_2)^2 = x_1x_2s \tag{22.101}$$

for massless partons. With $\hat{s}^{1/2} = M_W \sim 80$ GeV, and $s^{1/2} = 630$ GeV for 
the $p\bar{p}$ collider energy, we see that the x's are typically ~0.13, so that the 
valence q's in the proton and $\bar{q}$'s in the antiproton will dominate (at $\sqrt{s} = 1.8$ 
TeV, appropriate to the Fermilab Tevatron, $x \simeq 0.04$ and the sea quarks 
contribute). The parton model cross section $p\bar{p} \to W^\pm$ + anything is then 
(setting $V_{ud} = 1$ and all other $V_{ij} = 0$) 

$$\sigma(p\bar{p} \to W^\pm + X) = \frac{1}{3}\int_0^1 dx_1\int_0^1 dx_2\,\hat{\sigma}(x_1, x_2)\begin{Bmatrix} u(x_1)\bar{d}(x_2) + \bar{d}(x_1)u(x_2) \\ \bar{u}(x_1)d(x_2) + d(x_1)\bar{u}(x_2) \end{Bmatrix} \tag{22.102}$$

where the $\frac{1}{3}$ is the same colour factor as in the Drell-Yan process, and the 
subprocess cross section $\hat{\sigma}$ for $q\bar{q} \to W^\pm + X$ is (neglecting the $W^\pm$ width) 

$$\hat{\sigma} = 4\pi^2\alpha\,(1/4\sin^2\theta_W)\,\delta(\hat{s} - M_W^2) \tag{22.103}$$
$$\quad = \pi\,2^{1/2}\,G_F M_W^2\,\delta(x_1x_2s - M_W^2). \tag{22.104}$$

QCD corrections to (22.102) must as usual be included. Leading 
logarithms will make the distributions $Q^2$-dependent, and they should be 
evaluated at $Q^2 = M_W^2$. There will be further ($O(\alpha_s^2)$) corrections, which are often 
accounted for by a multiplicative factor ‘K’, which is of order 1.5-2 at these 
energies. $O(\alpha_s^2)$ calculations are presented in Hamberg et al. (1991) and by 
van der Neerven and Zijlstra (1992); see also Ellis et al. (1996) section 9.4. 
The total cross section for production of $W^+$ and $W^-$ at $\sqrt{s} = 630$ GeV is 
then of order 6.5 nb, while a similar calculation for the $Z^0$ gives about 2 nb. 
Multiplying these by the branching ratios gives 

$$\sigma(p\bar{p} \to W + X \to e\nu X) \simeq 0.7\ \text{nb} \tag{22.105}$$
$$\sigma(p\bar{p} \to Z^0 + X \to e^+e^-X) \simeq 0.07\ \text{nb} \tag{22.106}$$

at $\sqrt{s} = 630$ GeV. 


The total cross section for $p\bar{p}$ is about 70 mb at these energies: hence 
(22.105) represents $\sim 10^{-8}$ of the total cross section, and (22.106) is 10 times 
smaller. The rates could, of course, be increased by using the $q\bar{q}$ modes of 
W and $Z^0$, which have bigger branching ratios. But the detection of these is 
very difficult, being very hard to distinguish from conventional two-jet events 
produced via the mechanism discussed in section 14.3.2, which has a cross 
section some $10^3$ higher than (22.105). W and $Z^0$ would appear as slight 
shoulders on the edge of a very steeply falling invariant mass distribution, 
similar to that shown in figure 9.12, and the calorimetric jet energy resolution 
capable of resolving such an effect is hard to achieve. Thus despite the un- 
favourable branching ratios, the leptonic modes provide the better signatures, 
as discussed further in section 22.4.3. 
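
The orders of magnitude used in this subsection (typical parton momentum fractions and the fraction of the total cross section carried by the leptonic signals) can be reproduced with elementary arithmetic. The sketch below takes $x_1 = x_2$ in (22.101) as a simplifying assumption and uses the branching ratios of section 22.3; the function names are ours.

```python
NB_PER_MB = 1.0e6  # 1 mb = 10^6 nb

def typical_x(m_boson, sqrt_s):
    """Momentum fraction for x1 = x2, from x1 * x2 * s = M^2 (22.101)."""
    return m_boson / sqrt_s

def signal_fraction(sigma_nb, branching_ratio, sigma_total_mb=70.0):
    """(sigma x B) for a leptonic mode as a fraction of the total cross section."""
    return sigma_nb * branching_ratio / (sigma_total_mb * NB_PER_MB)

print(f"x at 630 GeV:  {typical_x(80.0, 630.0):.2f}")
print(f"x at 1.8 TeV:  {typical_x(80.0, 1800.0):.2f}")
print(f"W -> e nu:     {signal_fraction(6.5, 0.11):.1e}")
print(f"Z -> e+ e-:    {signal_fraction(2.0, 0.034):.1e}")
```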


22.4.2 Charge asymmetry in $W^\pm$ decay 


At energies such that the simple valence quark picture of (22.102) is valid, the 
$W^+$ is created in the annihilation of a left-handed u quark from the proton 
and a right-handed $\bar{d}$ quark from the $\bar{p}$ (neglecting fermion masses). In the 
$W^+ \to e^+\nu_e$ decay, a right-handed $e^+$ and left-handed $\nu_e$ are emitted. 
Referring to figure 22.16, we see that angular momentum conservation allows $e^+$ 
production parallel to the direction of the antiproton, but forbids it parallel 
to the direction of the proton. Similarly, in $W^- \to e^-\bar{\nu}_e$, the $e^-$ is emitted 
preferentially parallel to the proton (these considerations are exactly similar 
to those mentioned in section 20.7.2 with reference to $\nu q$ and $\bar{\nu}q$ scattering). 
The actual distribution has the form $\sim (1 + \cos\theta_e^*)^2$, where $\theta_e^*$ is the angle, in 
the rest frame of the W, between the $e^-$ and the p (for $W^- \to e^-\bar{\nu}_e$) or the 
$e^+$ and the $\bar{p}$ (for $W^+ \to e^+\nu_e$). 

FIGURE 22.16 
Preferred direction of leptons in $W^+$ decay. 


22.4.3 Discovery of the $W^\pm$ and $Z^0$ at the $p\bar{p}$ collider, and 
their properties 


As already indicated in section 22.4.1, the best signatures for W and Z 
production in $p\bar{p}$ collisions are provided by the leptonic modes 

$$p\bar{p} \to W^\pm X \to e^\pm\nu X \tag{22.107}$$
$$p\bar{p} \to Z^0 X \to e^+e^-X. \tag{22.108}$$

Reaction (22.107) has the larger cross section, by a factor of 10 (cf (22.105) and 
(22.106)), and was observed first (UA1, Arnison et al. 1983a; UA2, Banner 
et al. 1983). However, the kinematics of (22.108) is simpler and so the $Z^0$ 
discovery (UA1, Arnison et al. 1983b; UA2, Bagnaia et al. 1983) will be 
discussed first. 

The signature for (22.108) is an isolated, and approximately back-to-back, 
$e^+e^-$ pair with invariant mass peaked around 90 GeV (cf (22.78)). Very clean 
events can be isolated by imposing a modest transverse energy cut: the $e^+e^-$ 
pairs required are coming from the decay of a massive relatively slowly moving 
$Z^0$. Figure 22.17 shows the transverse energy distribution of a candidate $Z^0$ 
event from the first UA2 sample. Figure 22.18 shows (Geer 1986) the invariant 
mass distribution for a later sample of 14 UA1 events in which both electrons 
have well measured energies, together with the Breit-Wigner resonance curve 
appropriate to $M_Z = 93$ GeV/$c^2$, with experimental mass resolution folded 
in. The UA1 result for the $Z^0$ mass was 

$$M_Z = 93.0 \pm 1.4(\text{stat.}) \pm 3.2(\text{syst.})\ \text{GeV}. \tag{22.109}$$

The corresponding UA2 result (DiLella 1986), based on 13 well measured 
pairs, was 

$$M_Z = 92.5 \pm 1.3(\text{stat.}) \pm 1.5(\text{syst.})\ \text{GeV}. \tag{22.110}$$

In both cases the systematic error reflects the uncertainty in the absolute 
calibration of the calorimeter energy scale. Clearly the agreement with (22.78) 
is good, but there is a suggestion that the tree-level prediction is on the low 
side. Indeed, loop corrections adjust (22.78) to a value $M_Z^{\rm th} \simeq 91.19$ GeV, 
in excellent agreement with the current experimental value (Nakamura et al. 
2010). 

FIGURE 22.17 
The cell transverse energy distribution for a $Z^0 \to e^+e^-$ event (UA2, Bagnaia 
et al. 1983) in the $\theta$ and $\phi$ plane, where $\theta$ and $\phi$ are the polar and azimuth 
angles relative to the beam axis. 

FIGURE 22.18 
Invariant mass distribution for 14 well measured $Z^0 \to e^+e^-$ decays (UA1). 
Figure reprinted with permission from S Geer in High Energy Physics 1985, 
Proc. Yale Theoretical Advanced Study Institute, eds M J Bowick and F 
Gursey; copyright 1986 World Scientific Publishing Company. 

The total $Z^0$ width $\Gamma_Z$ is an interesting quantity. If we assume that, for 
any fermion family additional to the three known ones, only the neutrinos are 
significantly less massive than $M_Z/2$, we have 

$$\Gamma_Z \simeq (2.5 + 0.16\,\Delta N_\nu)\ \text{GeV} \tag{22.111}$$

from section 22.3, where $\Delta N_\nu$ is the number of additional light neutrinos 
(i.e. beyond $\nu_e$, $\nu_\mu$ and $\nu_\tau$) which contribute to the width through the process 
$Z^0 \to \nu\bar{\nu}$. Thus (22.111) can be used as an important measure of the number 
of such neutrinos (i.e. generations) if $\Gamma_Z$ can be determined accurately enough. 
The mass resolution of the $p\bar{p}$ experiments was of the same order as the total 
expected $Z^0$ width, so that (22.111) could not be used directly. The advent 
of LEP provided precision checks on (22.111); at the cost of departing from 
the historical development, we show data from DELPHI (Abreu et al. 1990, 
Abe 1991) in figure 22.19, which established $N_\nu = 3$. 

We turn now to the $W^\pm$. In this case an invariant mass plot is impossible, 
since we are looking for the $e\nu$ ($\mu\nu$) mode, and cannot measure the $\nu$'s. 
However, it is clear that (as in the case of $Z^0 \to e^+e^-$ decay) slow moving 
massive W's will emit isolated electrons with high transverse energy. Further, 
such electrons should be produced in association with large missing transverse 
energy (corresponding to the $\nu$'s), which can be measured by calorimetry, and 
which should balance the transverse energy of the electrons. Thus electrons of 
high $E_T$ accompanied by balancing high missing $E_T$ (i.e. similar in magnitude 
to that of the $e^\pm$ but opposite in azimuth) were the signatures used for the 
early event samples (UA1, Arnison et al. 1983a; UA2, Banner et al. 1983). 

The determination of the mass of the W is not quite so straightforward as 
that of the Z, since we cannot construct directly an invariant mass plot for the 
ev pair: only the missing transverse momentum (or energy) can be attributed 
to the v, since some unidentified longitudinal momentum will always be lost 
down the beam pipe. In fact, the distribution of events in per, the magnitude 
of the transverse momentum of the e~, should show a pronounced peaking 
towards the maximum kinematically allowed value, which is pep = + Mw, as 
may be seen from the following argument. Consider the decay of a W at rest 
(figure 22.20). We have |p,| = Mw and |per| = ¿Mw sin@ = per. Thus the 
transverse momentum distribution is given by 


do 2per 1 2 2 one 
= = — 22.112 
dcos 0 (3) G WET » l ) 


and the last (Jacobian) factor in (22.112) produces a strong peaking towards 
PeT = Mw. This peaking will be smeared by the width, and transverse 
motion, of the W. Early determinations of Mw used (22.112), but sensitivity 


do _ do _ |dcosé 
dper dcos | dper 
FIGURE 22.19 
The cross-section for $e^+e^- \to$ hadrons around the $Z^0$ mass (DELPHI, 1990). 
The dotted, continuous and dashed lines are the predictions of the Standard 
Model assuming two, three and four massless neutrino species respectively. 
Figure reprinted with permission from K Abe in Proc. 25th Int. Conf. on 
High Energy Physics eds K K Phua and Y Yamaguchi; copyright 1991 World 
Scientific Publishing Company. 

FIGURE 22.20 
Kinematics of $W \to e\nu$ decay. 

FIGURE 22.21 
$W \to e\nu$ transverse mass distribution measured by the CDF collaboration. 
Figure reprinted with permission from F Abe et al. (CDF Collaboration) 
Phys. Rev. D 52 4784 (1995). Copyright 1995 by the American Physical 
Society. 


to the transverse momentum of the W can be much reduced (Barger et al. 
1983) by considering instead the distribution in ‘transverse mass’, defined by 

$$M_T^2 = (E_{eT} + E_{\nu T})^2 - (\boldsymbol{p}_{eT} + \boldsymbol{p}_{\nu T})^2 \simeq 2 p_{eT}\, p_{\nu T}\,(1 - \cos\phi), \qquad (22.113)$$

where $\phi$ is the azimuthal separation between $\boldsymbol{p}_{eT}$ and $\boldsymbol{p}_{\nu T}$. Here $E_{\nu T}$ and $\boldsymbol{p}_{\nu T}$ 
are the neutrino transverse energy and momentum, measured from the missing 
transverse energy and momentum obtained from the global event reconstruction. 
This inclusion of additional measured quantities improves the precision 
as compared with the Jacobian peak method, using (22.112). A Monte Carlo 
simulation was used to generate $M_T$ distributions for different values of $M_W$, 
and the most probable value was found by a maximum likelihood fit. The 
quoted results were 


UA1 (Geer 1986): $M_W = 83.5\,^{+1.1}_{-1.0}$ (stat.) $\pm\, 2.8$ (syst.) GeV (22.114) 
UA2 (DiLella 1986): $M_W = 81.2 \pm 1.1$ (stat.) $\pm\, 1.3$ (syst.) GeV (22.115) 


the systematic errors again reflecting uncertainty in the absolute energy scale 
of the calorimeters. The two experiments also quoted (Geer 1986, DiLella 
1986) 

UA1: $\Gamma_W < 6.5$ GeV 
UA2: $\Gamma_W < 7.0$ GeV 
both at 90% c.l. (22.116) 
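
The kinematic statements behind (22.112) and (22.113) can be checked with a small Monte Carlo sketch. It is an illustration only, not the simulation used by the experiments: the decay is taken to be isotropic in the W rest frame (a simplification of the true angular distribution), the leptons are massless, the W width is ignored, and the value of `M_W` and the transverse speed `beta_t` are assumed inputs.

```python
import numpy as np

M_W = 80.4  # GeV; illustrative input, not a fitted value


def transverse_observables(n_events, beta_t, seed=0):
    """Return (p_eT, M_T) for W -> e nu with massless leptons.

    The W moves along the transverse x-axis with speed beta_t (units of c);
    the beam is along z. The decay is isotropic in the W rest frame, which
    is a simplifying assumption.
    """
    rng = np.random.default_rng(seed)
    cos_theta = rng.uniform(-1.0, 1.0, n_events)
    sin_theta = np.sqrt(1.0 - cos_theta**2)
    phi = rng.uniform(0.0, 2.0 * np.pi, n_events)

    e_star = 0.5 * M_W                      # lepton energy in the W rest frame
    px = e_star * sin_theta * np.cos(phi)   # electron momentum, rest frame
    py = e_star * sin_theta * np.sin(phi)

    gamma = 1.0 / np.sqrt(1.0 - beta_t**2)
    # Boost along x; the neutrino has the opposite rest-frame momentum.
    pe = np.stack([gamma * (px + beta_t * e_star), py])
    pnu = np.stack([gamma * (-px + beta_t * e_star), -py])

    p_et = np.hypot(pe[0], pe[1])
    p_nut = np.hypot(pnu[0], pnu[1])
    # (22.113): M_T^2 = 2 p_eT p_nuT (1 - cos phi) = 2 (p_eT p_nuT - pe.pnu)
    mt_sq = 2.0 * (p_et * p_nut - (pe * pnu).sum(axis=0))
    return p_et, np.sqrt(np.clip(mt_sq, 0.0, None))


pt_rest, mt_rest = transverse_observables(200_000, beta_t=0.0)
pt_moving, mt_moving = transverse_observables(200_000, beta_t=0.1)

# Fraction of events in the top 10% of the allowed p_eT range (Jacobian peak).
peak_fraction = np.mean(pt_rest > 0.9 * 0.5 * M_W)
```

For a W at rest the events pile up just below $p_{eT} = \tfrac12 M_W$. When the W has transverse motion, the $p_{eT}$ spectrum leaks beyond $\tfrac12 M_W$, whereas $M_T$ still does not exceed $M_W$; this is the reduced sensitivity referred to in the text.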


Once again, the agreement between the experiments, and of both with (22.77), 
is good, the predictions again being on the low side. Loop corrections adjust 
FIGURE 22.22 
The W decay angular distribution of the emission angle $\theta_e^*$ of the positron 
(electron) with respect to the antiproton (proton) beam direction, in the rest 
frame of the W, for a total of 75 events; background subtracted and acceptance 
corrected (Arnison et al. 1986). 


(22.77) to $M_W = 80.38$ GeV (Nakamura et al. 2010). We show in figure 22.21 
a later determination of $M_W$ by the CDF collaboration (Abe et al. 1995a). 
The W and Z mass values may be used together with (22.42) to obtain 
$\sin^2\theta_W$ via 

$$\sin^2\theta_W = 1 - M_W^2/M_Z^2. \qquad (22.117)$$

The weighted average of UA1 and UA2 yielded 

$$\sin^2\theta_W = 0.212 \pm 0.022\ \text{(stat.)}. \qquad (22.118)$$

Radiative corrections have in general to be applied, but one renormalization 
scheme (see section 22.6) promotes (22.117) to a definition of the renormalized 
$\sin^2\theta_W$ to all orders in perturbation theory. Using this scheme and quoted 
values of $M_W$, $M_Z$ (Nakamura et al. 2010) one finds $\sin^2\theta_W \simeq 0.223$. 

Finally, figure 22.22 shows (Arnison et al. 1986) the angular distribution of 
the charged lepton in $W \to e\nu$ decay (see section 22.4.2); $\theta_e^*$ is the $e^+$ ($e^-$) angle 
in the W rest frame, measured with respect to a direction parallel (antiparallel) 
to the $\bar p$ ($p$) beam. The expected form $(1 + \cos\theta_e^*)^2$ is followed very closely. 
In summary, we may say that the early discovery experiments provided 
remarkably convincing confirmation of the principal expectations of the GSW 
theory, as outlined in section 22.3. 
We now consider some further aspects of the theory. 


22.5 Fermion masses 
22.5.1 One generation 


The fact that the SU(2)$_L$ gauge group acts only on the L components of the 
fermion fields immediately appears to create a fundamental problem as far 
as the masses of these particles are concerned; we mentioned this briefly at 
the end of section 19.6. Let us recall first that the standard way to introduce 
the interactions of gauge fields with matter fields (e.g. fermions) is via the 
covariant derivative replacement 

$$\partial^\mu \to \hat D^\mu = \partial^\mu + i g\, \boldsymbol{\tau}\cdot\hat{\boldsymbol{W}}^\mu/2 \qquad (22.119)$$

for SU(2) fields $\hat{\boldsymbol{W}}^\mu$ acting on $t = 1/2$ doublets. Now it is a simple exercise 
(compare problem 18.3) to check that the ordinary ‘kinetic’ part of a free 
Dirac fermion does not mix the L and R components of the field: 

$$\bar{\hat\psi}\,\slashed{\partial}\,\hat\psi = \bar{\hat\psi}_R\,\slashed{\partial}\,\hat\psi_R + \bar{\hat\psi}_L\,\slashed{\partial}\,\hat\psi_L. \qquad (22.120)$$


Thus we can in principle contemplate ‘gauging’ the L and the R components 
differently. Of course, in the case of QCD (cf (18.39)) the replacement $\slashed{\partial} \to \slashed{D}$ 
was made equally in each term on the right-hand side of (22.120). But this was 
because QCD conserves parity, and must therefore treat L and R components 
the same. Weak interactions are parity violating, and the SU(2)$_L$ covariant 
derivative acts only in the second term of (22.120). On the other hand, a 
Dirac mass term has the form 

$$-m\left(\bar{\hat\psi}_L\hat\psi_R + \bar{\hat\psi}_R\hat\psi_L\right) \qquad (22.121)$$

(see equation (18.41) for example), and it precisely couples the L and R 
components. It is easy to see that if only $\hat\psi_L$ is subject to a transformation, then 
(22.121) is not invariant. Thus mass terms for Dirac fermions will explicitly 
break SU(2)$_L$. The same is also true for Majorana fermions which might 
describe the neutrinos. 

This kind of explicit breaking of the gauge symmetry cannot be tolerated, 
in the sense that it will lead, once again, to violations of unitarity, and then of 
FIGURE 22.23 
One-$Z^0$ and one-$\gamma$ annihilation contribution to $f_{\lambda=-1}\,\bar f_{\lambda=1} \to W_0^+ W_0^-$. 


renormalizability. Consider, for example, a fermion-antifermion annihilation 
process of the form 

$$f\bar f \to W_0^+ W_0^-, \qquad (22.122)$$


where the subscript indicates the $\lambda = 0$ (longitudinal) polarization state of the 
$W^\pm$. We studied such a reaction in section 22.1.1 in the context of unitarity 
violations (in lowest-order perturbation theory) for the IVB model. Appelquist 
and Chanowitz (1987) considered first the case in which ‘f’ is a lepton with 
$t = \tfrac12$, $t_3 = -\tfrac12$ coupling to W’s, $Z^0$ and $\gamma$ with the usual SU(2) × U(1) 
couplings, but having an explicit (Dirac) mass $m_f$. They found that in the 
‘right’ helicity channels for the leptons ($\lambda = +1$ for $\bar f$, $\lambda = -1$ for $f$) the 
bad high energy behaviour associated with a fermion-exchange diagram of 
the form of figure 22.4 was cancelled by that of the diagrams shown in figure 
22.23. The sum of the amplitudes tends to a constant as $s$ (or $E^2$) $\to \infty$. 
Such cancellations are a feature of gauge theories, as we indicated at the end 
of section 22.1.2, and represent one aspect of the renormalizability of the 
theory. But suppose, following Appelquist and Chanowitz (1987), we examine 
channels involving the ‘wrong’ helicity component, for example $\lambda = +1$ for 
the fermion $f$. Then it is found that the cancellation no longer occurs, and we 
shall ultimately have a ‘non-renormalizable’ problem on our hands, all over 
again. 

An estimate of the energy at which this will happen can be made by 
recalling that the ‘wrong’ helicity state participates only by virtue of a factor 
($m_f$/energy) (recall section 20.2.2), which here we can take to be $m_f/\sqrt{s}$. 
The typical bad high energy behaviour for an amplitude $\mathcal{M}$ was $\mathcal{M} \sim G_F s$, 
which we expect to be modified here to 

$$\mathcal{M} \sim G_F\, s\, m_f/\sqrt{s} \sim G_F\, m_f \sqrt{s}. \qquad (22.123)$$

The estimate obtained by Appelquist and Chanowitz differs only by a factor of 
$\sqrt2$. Attending to all the factors in the partial wave expansion gives the result 
that the unitarity bound will be saturated at $E = E_f$ (TeV) $\sim \pi/m_f$ (TeV). 
Thus for $m_t \sim 175$ GeV, $E_t \sim 18$ TeV. This would constitute a serious flaw 
in the theory, even though the breakdown occurs at energies beyond those 
currently reachable. 
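
As a quick arithmetic check of the estimate $E_f \sim \pi/m_f$ (both in TeV), the following sketch evaluates it for the top-quark mass quoted above. The function name is ours, and the formula is only the order-of-magnitude estimate stated in the text.

```python
import math


def unitarity_saturation_energy_tev(m_f_gev):
    """Order-of-magnitude estimate E_f (TeV) ~ pi / m_f (TeV)."""
    m_f_tev = m_f_gev / 1000.0
    return math.pi / m_f_tev


e_top = unitarity_saturation_energy_tev(175.0)   # about 18 TeV
```

Because the bound scales as $1/m_f$, lighter fermions push the breakdown to much higher energies.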

However, in a theory with spontaneous symmetry breaking, there is a way 
of giving fermion masses without introducing an explicit mass term in the 
Lagrangian. Consider the electron, for example, and let us hypothesize a 
‘Yukawa’-type coupling between the electron-type SU(2) doublet 

$$\hat l_{eL} = \begin{pmatrix} \hat\nu_e \\ \hat e^- \end{pmatrix}_L, \qquad (22.124)$$

the Higgs doublet $\hat\phi$, and the R-component of the electron field: 

$$\hat{\mathcal{L}}^{e}_{\rm Yuk} = -g_e\left(\bar{\hat l}_{eL}\,\hat\phi\,\hat e_R + \bar{\hat e}_R\,\hat\phi^\dagger\,\hat l_{eL}\right). \qquad (22.125)$$

In each term of (22.125), the two SU(2)$_L$ doublets are ‘dotted together’ so as 
to form an SU(2)$_L$ scalar, which multiplies the SU(2)$_L$ scalar R-component. 
Thus (22.125) is SU(2)$_L$-invariant, and the symmetry is preserved, at the 
Lagrangian level, by such a term. But now insert just the vacuum value 
(22.28) of $\hat\phi$ into (22.125): we find the result 

$$\hat{\mathcal{L}}^{e}_{\rm Yuk}({\rm vac}) = -g_e\frac{v}{\sqrt2}\left(\bar{\hat e}_L\hat e_R + \bar{\hat e}_R\hat e_L\right) \qquad (22.126)$$

which is exactly a (Dirac) mass of the form (22.121), allowing us to make the 
identification 

$$m_e = g_e v/\sqrt2. \qquad (22.127)$$


When oscillations about the vacuum value are considered via the replacement 
(22.29), the term (22.125) will generate a coupling between the electron 
and the Higgs fields of the form 

$$-g_e\,\bar{\hat e}\hat e\,\hat H/\sqrt2 = -(m_e/v)\,\bar{\hat e}\hat e\,\hat H \qquad (22.128)$$
$$\phantom{-g_e\,\bar{\hat e}\hat e\,\hat H/\sqrt2} = -(g m_e/2M_W)\,\bar{\hat e}\hat e\,\hat H. \qquad (22.129)$$

Such a coupling, if present for the process $f\bar f \to W_0^+ W_0^-$ 
considered earlier, will mean that, in addition to the f-exchange graph analogous 
to figure 22.4 and the annihilation graphs of figure 22.23, a further graph, 
shown in figure 22.24, must be included. The presence of the fermion mass in 
the coupling to H suggests that this graph might be just what is required to 
cancel the ‘bad’ high energy behaviour found in (22.123), and by this time 
the reader will not be surprised to be told that this is indeed the case. 

At first sight it might seem that this stratagem will only work for the 
$t_3 = -\tfrac12$ components of doublets, because of the form of $\langle 0|\hat\phi|0\rangle$. But we 
learned in section 12.1.3 that if a pair of states $\begin{pmatrix} u \\ d \end{pmatrix}$ forming an SU(2) 
FIGURE 22.24 
One-H annihilation graph. 


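doublet transform by 

$$\begin{pmatrix} u \\ d \end{pmatrix}' = e^{-i\boldsymbol{\alpha}\cdot\boldsymbol{\tau}/2}\begin{pmatrix} u \\ d \end{pmatrix}, \qquad (22.130)$$

then the charge conjugate states $i\tau_2\begin{pmatrix} u^* \\ d^* \end{pmatrix}$ transform in exactly the same 
way. Thus if, in our case, $\hat\phi$ is the SU(2) doublet 

$$\hat\phi = \begin{pmatrix} \tfrac{1}{\sqrt2}(\hat\phi_1 - i\hat\phi_2) \equiv \hat\phi^+ \\ \tfrac{1}{\sqrt2}(\hat\phi_3 - i\hat\phi_4) \equiv \hat\phi^0 \end{pmatrix} \qquad (22.131)$$

then the charge conjugate field 

$$\hat\phi_C \equiv i\tau_2\hat\phi^{*} = \begin{pmatrix} \tfrac{1}{\sqrt2}(\hat\phi_3 + i\hat\phi_4) \\ -\tfrac{1}{\sqrt2}(\hat\phi_1 + i\hat\phi_2) \end{pmatrix} \equiv \begin{pmatrix} \bar{\hat\phi}^0 \\ -\hat\phi^- \end{pmatrix} \qquad (22.132)$$

is also an SU(2) doublet, transforming in just the same way as $\hat\phi$. ((22.131) 
and (22.132) may be thought of as analogous to the $(K^+, K^0)$ and $(\bar K^0, K^-)$ 
isospin doublets in SU(3)). Note that the vacuum value (22.28) will now 
appear in the upper component of (22.132). With the help of $\hat\phi_C$ we can write 
down another SU(2)-invariant coupling in the $\nu_e$-$e$ sector, namely 

$$-g_{\nu_e}\left(\bar{\hat l}_{eL}\,\hat\phi_C\,\hat\nu_{eR} + \bar{\hat\nu}_{eR}\,\hat\phi_C^\dagger\,\hat l_{eL}\right), \qquad (22.133)$$

assuming now the existence of the field $\hat\nu_{eR}$. In the Higgs vacuum (22.28), 
(22.133) then yields 

$$-(g_{\nu_e} v/\sqrt2)\left(\bar{\hat\nu}_{eL}\hat\nu_{eR} + \bar{\hat\nu}_{eR}\hat\nu_{eL}\right) \qquad (22.134)$$

which is precisely a (Dirac) mass for the neutrino, if we set $g_{\nu_e} v/\sqrt2 = m_{\nu_e}$. 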
It is clearly possible to go on like this, and arrange for all the fermions, 
quarks as well as leptons, to acquire a mass by the same ‘mechanism’. We will 
look more closely at the quarks in the next section. But one must admit to 
a certain uneasiness concerning the enormous difference in magnitudes 
represented by the couplings $g_{\nu_e}, \ldots, g_e, \ldots, g_t$. If $m_{\nu_e} < 1$ eV then $g_{\nu_e} < 10^{-11}$, while 
$g_t \sim 1$! Besides, whereas the use of the Higgs field ‘mechanism’ in the W-Z 
sector is quite economical, in the present case it seems rather unsatisfactory 
simply to postulate a different ‘g’ for each fermion-Higgs interaction. This 
does appear to indicate that we are dealing here with a ‘phenomenological 
model’, once more, rather than a ‘theory’. 
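
The spread in Yukawa couplings follows directly from $g_f = \sqrt2\, m_f/v$, the analogue of (22.127) for each fermion. The sketch below evaluates it for three illustrative masses. The vacuum value `V_HIGGS_GEV` and the mass inputs are assumed round numbers, and the neutrino entry is the 1 eV upper bound used in the text rather than a measured mass.

```python
import math

V_HIGGS_GEV = 246.0  # assumed tree-level vacuum value v, in GeV


def yukawa_coupling(mass_gev, v_gev=V_HIGGS_GEV):
    """g_f = sqrt(2) m_f / v, as in (22.127)."""
    return math.sqrt(2.0) * mass_gev / v_gev


# Illustrative masses in GeV (assumed round values).
masses_gev = {"nu_e (1 eV bound)": 1.0e-9, "e": 0.511e-3, "t": 173.0}
couplings = {name: yukawa_coupling(m) for name, m in masses_gev.items()}
```

With these inputs the ratio of the largest to the smallest coupling is of order $10^{11}$, which is the hierarchy the text finds uncomfortable.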

As far as the neutrinos are concerned, however, there is another possibility, 
already discussed in sections 7.5.2, 20.3 and 21.4.1, which is that they could 
be Majorana (not Dirac) fermions. In this case, rather than the four degrees 
of freedom ($\nu_{eL}$, $\nu_{eR}$, and their antiparticles) which exist for massive Dirac 
particles, only two possibilities exist for neutrinos, which we may take to be 
$\nu_{eL}$ and $\nu_{eR}$. With these, it is certainly possible to construct a Dirac-type 
mass term of the form (22.134). But since, after all, the $\nu_{eR}$ component has 
been assigned zero quantum numbers both for SU(2)$_L$ W-interactions and for 
U(1) B-interactions (see table 22.1), we could consider economically dropping 
it altogether, making do with just the $\nu_{eL}$ component. 

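Suppose, then, that we keep only the field $\hat\nu_{eL}$. We need to form a mass 
term for it. The charge-conjugate field is defined by (see (7.151)) 

$$(\hat\nu_{eL})_C = i\gamma_2\,\hat\nu_{eL}^{\dagger T} = i\gamma_2\gamma_0\,\bar{\hat\nu}_{eL}^{T} \qquad (22.135)$$

and we know that the charge-conjugate field transforms under Lorentz 
transformations in the same way as the original field. So we can use $(\hat\nu_{eL})_C$ to form 
a Lorentz invariant 

$$\overline{(\hat\nu_{eL})_C}\,\hat\nu_{eL} \qquad (22.136)$$

which has mass dimension $M^3$. Hence we may write a mass term for $\hat\nu_{eL}$ in 
the form 

$$-\tfrac12 m_M\left[\overline{(\hat\nu_{eL})_C}\,\hat\nu_{eL} + \bar{\hat\nu}_{eL}\,(\hat\nu_{eL})_C\right] \qquad (22.137)$$

where the $\tfrac12$ is conventional. Written out in more detail, we have 

$$\overline{(\hat\nu_{eL})_C}\,\hat\nu_{eL} = \hat\nu_{eL}^{T}\,(-i\gamma_2^\dagger\gamma_0)\,\hat\nu_{eL} = \hat\nu_{eL}^{T}\, i\gamma_2\gamma_0\,\hat\nu_{eL}, \qquad (22.138)$$

in the representation (20.14). Now 

$$i\gamma_2\gamma_0 = \begin{pmatrix} -i\sigma_2 & 0 \\ 0 & i\sigma_2 \end{pmatrix}. \qquad (22.139)$$

But since $\hat\nu_{eL}$ is an L-chiral field, only its 2 lower components are present (cf 
(20.26)) and (22.138) is effectively 

$$\overline{(\hat\nu_{eL})_C}\,\hat\nu_{eL} = \hat\nu_{eL}^{T}\,(i\sigma_2)\,\hat\nu_{eL}. \qquad (22.140)$$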
This is just the form of the mass term for a Majorana field, as we saw in 
equation (7.159). The two formalisms are equivalent. 
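
The matrix identity (22.139), and the reduction to (22.140) on the lower two components, can be verified numerically. The sketch assumes that the representation (20.14) is the chiral one with $\boldsymbol\alpha = \mathrm{diag}(\boldsymbol\sigma, -\boldsymbol\sigma)$ and $\beta$ off-diagonal, so that $\gamma^0 = \beta$ and $\gamma^i = \beta\alpha_i$; if a different representation is intended, the block structure will differ.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

# Assumed form of representation (20.14).
beta = np.block([[Z2, I2], [I2, Z2]])
alpha = [np.block([[s, Z2], [Z2, -s]]) for s in sigma]
gamma0 = beta
gamma = [beta @ a for a in alpha]            # gamma[1] is gamma_2
gamma5 = 1j * gamma0 @ gamma[0] @ gamma[1] @ gamma[2]

lhs = 1j * gamma[1] @ gamma0                 # i gamma_2 gamma_0
rhs = np.block([[-1j * sigma[1], Z2], [Z2, 1j * sigma[1]]])

# Projector onto the L-chiral (lower) components in this representation.
p_left = 0.5 * (np.eye(4) - gamma5)
lower_block = (p_left.T @ lhs @ p_left)[2:, 2:]
```

Here `lhs` equals `rhs`, and `lower_block` is $i\sigma_2$, as used in (22.140).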

As noted in section 21.4.1, the mass term (22.137) is not invariant under 
a global U(1) phase transformation 

$$\hat\nu_{eL} \to e^{-i\alpha}\,\hat\nu_{eL} \qquad (22.141)$$

which would correspond to lepton number (if accompanied by a similar 
transformation for the electron fields): the Majorana mass term violates lepton 
number conservation. 

There is a further interesting aspect to (22.140) which is that, since two 
$\hat\nu_{eL}$ operators appear rather than a $\hat\nu_e$ and a $\hat\nu_e^\dagger$ (which would lead to $L_e$ 
conservation), the $(t, t_3)$ quantum numbers of the term are (1,1). This means 
that we cannot form an SU(2)$_L$ invariant with it, using only the Standard 
Model Higgs $\hat\phi$, since the latter has $t = \tfrac12$ and cannot combine with the (1,1) 
operator to form a singlet. Thus we cannot make a ‘tree-level’ Majorana 
mass by the mechanism of Yukawa coupling to the Higgs field, followed by 
symmetry breaking. 

However, we could generate suitable ‘effective’ operators via loop corrections, 
perhaps, much as we generated an effective operator representing an 
anomalous magnetic moment interaction in QED (cf section 11.7). But whatever 
it is, the operator would have to violate lepton number conservation, 
which is actually conserved by all the Standard Model interactions. Thus 
such an effective operator could not be generated in perturbation theory. It 
could arise, however, as a low energy limit of a theory defined at a higher 
mass scale, as the current-current model is the low energy limit of the GSW 
one. The typical form of such an operator we need, in order to generate a term 
$\hat\nu_{eL}^{T}\, i\sigma_2\, \hat\nu_{eL}$, is 

$$-\frac{g_{eM}}{M}\,(\hat\phi_C^\dagger\,\hat l_{eL})^{T}\, i\sigma_2\, (\hat\phi_C^\dagger\,\hat l_{eL}). \qquad (22.142)$$

Note, most importantly, that the operator ‘$(l\phi)(\phi l)$’ in (22.142) has mass 
dimension five, which is why we introduced the factor $M^{-1}$ in the coupling; 
it is indeed a non-renormalizable effective interaction, just like the 
current-current one. We may interpret $M$ as the mass scale at which ‘new physics’ 
enters, in the spirit of the discussion in section 11.7. Suppose, for the sake of 
argument, this was $M \sim 10^{16}$ GeV (a scale typical of Grand Unified Theories). 
After symmetry breaking, then, (22.142) will generate the required Majorana 
mass term, with 

$$m_M \sim g_{eM}\,\frac{v^2}{M} \sim g_{eM}\, 10^{-2}\ \text{eV}. \qquad (22.143)$$

Thus an effective coupling of ‘natural’ size $g_{eM} \sim 0.1$ emerges from this 
argument, if indeed the mass of the $\nu_e$ is of order $10^{-3}$ eV. 
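
The order of magnitude in (22.143) is easily reproduced. In the sketch below the vacuum value `V_GEV` is an assumed round number, and the scale $M = 10^{16}$ GeV is the illustrative choice made in the text.

```python
V_GEV = 246.0        # assumed tree-level vacuum value v, in GeV
GEV_TO_EV = 1.0e9


def majorana_mass_ev(g_em, big_m_gev, v_gev=V_GEV):
    """Estimate m_M ~ g_eM v^2 / M of (22.143), returned in eV."""
    return g_em * v_gev**2 / big_m_gev * GEV_TO_EV


scale_ev = majorana_mass_ev(1.0, 1.0e16)   # v^2 / M alone
mass_ev = majorana_mass_ev(0.1, 1.0e16)    # with g_eM ~ 0.1
```

With these inputs $v^2/M$ is a few times $10^{-3}$ eV, in line with the rough $10^{-2}$ eV of (22.143), and a coupling $g_{eM} \sim 0.1$ gives a mass somewhat below $10^{-3}$ eV.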

A more specific model can be constructed in which a relation of the form 
(22.143) can arise naturally. Suppose $\hat\nu_R$ is an R-type neutrino field which is 
an SU(2) × U(1) singlet, and which has a gauge-invariant Yukawa coupling 
to the Higgs field, of the form (22.133). Then the Yukawa and the mass terms 
for $\hat\nu_R$ are 

$$\hat{\mathcal{L}}_{Y,R} = -g_R\left(\bar{\hat l}_{eL}\,\hat\phi_C\,\hat\nu_R + \bar{\hat\nu}_R\,\hat\phi_C^\dagger\,\hat l_{eL}\right) - \tfrac12 m_R\left[\overline{(\hat\nu_R)_C}\,\hat\nu_R + \text{h.c.}\right]. \qquad (22.144)$$

Then, in the Higgs vacuum the first term in (22.144) becomes 

$$-m_D\left(\bar{\hat\nu}_{eL}\hat\nu_R + \bar{\hat\nu}_R\hat\nu_{eL}\right) \qquad (22.145)$$


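where $m_D = g_R v/\sqrt2$. The term (22.145) couples the fields $\hat\nu_R$ and $\hat\nu_{eL}$, so 
that we need to do a diagonalization to find the true mass eigenvalues and 
eigenstates. The combined mass terms from (22.144) and (22.145) can be 
written as 

$$-\tfrac12\,\overline{(\hat N_L)_C}\, \mathbf{M}\, \hat N_L + \text{h.c.} \qquad (22.146)$$

where 

$$\hat N_L = \begin{pmatrix} \hat\nu_{eL} \\ (\hat\nu_R)_C \end{pmatrix}, \qquad \mathbf{M} = \begin{pmatrix} 0 & m_D \\ m_D & m_R \end{pmatrix}. \qquad (22.147)$$

CP invariance would imply that the parameters $m_D$ and $m_R$ are real, as we 
will assume, for simplicity. 

Suppose now that $m_D \ll m_R$. Then the eigenvalues of $\mathbf{M}$ are approximately 

$$m_1 \approx m_R, \qquad m_2 \approx -m_D^2/m_R. \qquad (22.148)$$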


The apparently troubling minus sign can be absorbed into the mixing 
parameters. Thus one eigenvalue is (by assumption) very large compared to $m_D$, 
and one is very much smaller. The vanishing of the first element in $\mathbf{M}$ ensures 
that the lepton number violating term (22.137) is characterized by a large 
mass scale $m_R$. It may be natural to assume that $m_D$ is a ‘typical’ quark or 
lepton mass term, which would then imply that $m_2$ of (22.148) is very much 
lighter than that, as appears to be true for the neutrinos. This is the famous 
‘see-saw’ mechanism of Minkowski (1977), Gell-Mann et al. (1979), Yanagida 
(1979) and Mohapatra and Senjanovic (1980, 1981). If in fact $m_R \sim 10^{16}$ GeV, 
we recover an estimate for $m_2$ which is similar to that in (22.143). It is worth 
emphasizing that the Majorana nature of the massive neutrinos is an essential 
part of the see-saw mechanism. 
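
The approximate eigenvalues (22.148) can be compared with the exact ones of the $2\times2$ matrix in (22.147). Since $m_D/m_R$ may be as small as $10^{-14}$, a direct numerical diagonalization would lose the small eigenvalue to rounding, so the sketch below uses the closed form together with $\det \mathbf{M} = -m_D^2$. The input $m_D = 100$ GeV is an assumed ‘typical’ value, chosen only for illustration, and $m_R$ is taken positive.

```python
import math


def seesaw_eigenvalues(m_d, m_r):
    """Exact eigenvalues (m1, m2) of [[0, m_d], [m_d, m_r]] for m_r > 0.

    m1 is the large root of x^2 - m_r x - m_d^2 = 0; m2 then follows from
    the determinant, m1 * m2 = -m_d^2, which avoids cancellation errors.
    """
    root = math.hypot(m_r, 2.0 * m_d)
    m1 = 0.5 * (m_r + root)
    m2 = -m_d**2 / m1
    return m1, m2


# Illustrative inputs in GeV (assumptions, not measured values).
m1, m2 = seesaw_eigenvalues(m_d=100.0, m_r=1.0e16)
m2_ev = m2 * 1.0e9   # light eigenvalue converted to eV
```

With these inputs $|m_2|$ is about $10^{-3}$ eV: raising $m_R$ lowers the light mass, hence the name ‘see-saw’.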

These considerations are tending to take us ‘beyond the Standard Model’, 
so we shall not pursue them at any greater length. Instead, we must now 
generalize the discussion of fermion masses to the three-generation case. 


22.5.2 Three-generation mixing 


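We introduce three doublets of left-handed quark fields 

$$\hat q_{L1} = \begin{pmatrix} \hat u_{L1} \\ \hat d_{L1} \end{pmatrix}, \quad \hat q_{L2} = \begin{pmatrix} \hat u_{L2} \\ \hat d_{L2} \end{pmatrix}, \quad \hat q_{L3} = \begin{pmatrix} \hat u_{L3} \\ \hat d_{L3} \end{pmatrix} \qquad (22.149)$$

and the corresponding six singlets 

$$\hat u_{R1},\ \hat d_{R1},\ \hat u_{R2},\ \hat d_{R2},\ \hat u_{R3},\ \hat d_{R3}, \qquad (22.150)$$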

which transform in the now familiar way under SU(2)$_L$ × U(1). The $\hat u$-fields 
correspond to the $t_3 = +\tfrac12$ components of SU(2)$_L$, the $\hat d$ ones to the $t_3 = -\tfrac12$ 
components, and to their ‘R’ partners. The labels 1, 2 and 3 refer to the 
family number; for example, with no mixing at all, $\hat u_{L1} = \hat u_L$, $\hat d_{L1} = \hat d_L$, 
etc. We have to consider what is the most general SU(2)$_L$ × U(1)-invariant 
interaction between the Higgs field (assuming we can still get by with only one) 
and these various fields. Apart from the symmetry, the only other theoretical 
requirement is renormalizability; for, after all, if we drop this we might as well 
abandon the whole motivation for the ‘gauge’ concept. This implies (as in the 
discussion of the Higgs potential V) that we cannot have terms like $(\bar{\hat\psi}\hat\psi\hat\phi)^2$ 
appearing, which would have a coupling with dimensions (mass)$^{-4}$ and would 
be non-renormalizable. In fact the only renormalizable Yukawa coupling is of 
the form ‘$\bar{\hat\psi}\hat\psi\hat\phi$’, which has a dimensionless coupling (as in the $g_e$ and $g_{\nu_e}$ of 
(22.125) and (22.133)). However, there is no a priori requirement for it to be 
‘diagonal’ in the weak interaction family index $i$. The allowed generalization 
of (22.125) and (22.133) is therefore an interaction of the form (summing on 
repeated indices) 

$$\hat{\mathcal{L}}_{\psi\phi} = a_{ij}\,\bar{\hat q}_{Li}\,\hat\phi_C\,\hat u_{Rj} + b_{ij}\,\bar{\hat q}_{Li}\,\hat\phi\,\hat d_{Rj} + \text{h.c.} \qquad (22.151)$$

where 

$$\hat q_{Li} = \begin{pmatrix} \hat u_{Li} \\ \hat d_{Li} \end{pmatrix} \qquad (22.152)$$


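and a sum on the family indices $i$ and $j$ (from 1 to 3) in (22.151) is assumed. 
After symmetry breaking, using the gauge (22.29), we find (problem 22.6) 

$$\hat{\mathcal{L}}_{\psi\phi} = -\left(1 + \frac{\hat H}{v}\right)\left[\bar{\hat u}_{Li}\, m^u_{ij}\,\hat u_{Rj} + \bar{\hat d}_{Li}\, m^d_{ij}\,\hat d_{Rj} + \text{h.c.}\right], \qquad (22.153)$$

where the ‘mass matrices’ are 

$$m^u_{ij} = -\frac{v}{\sqrt2}\,a_{ij}, \qquad m^d_{ij} = -\frac{v}{\sqrt2}\,b_{ij}. \qquad (22.154)$$

Although we have not indicated it, the $m^u$ and $m^d$ matrices could involve 
a ‘$\gamma_5$’ part as well as a ‘1’ part in Dirac space. It can be shown (Weinberg 
1973, Feinberg et al. 1959) that $m^u$ and $m^d$ can both be made Hermitean, 
$\gamma_5$-free, and diagonal by making four separate unitary transformations on the 
‘generation triplets’ 

$$\hat u_L = \begin{pmatrix} \hat u_{L1} \\ \hat u_{L2} \\ \hat u_{L3} \end{pmatrix}, \qquad \hat d_L = \begin{pmatrix} \hat d_{L1} \\ \hat d_{L2} \\ \hat d_{L3} \end{pmatrix}, \quad \text{etc.} \qquad (22.155)$$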
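via 

$$\hat u_{L\alpha} = (U_L^{(u)})_{\alpha i}\,\hat u_{Li}, \qquad \hat u_{R\alpha} = (U_R^{(u)})_{\alpha i}\,\hat u_{Ri} \qquad (22.156)$$
$$\hat d_{L\alpha} = (U_L^{(d)})_{\alpha i}\,\hat d_{Li}, \qquad \hat d_{R\alpha} = (U_R^{(d)})_{\alpha i}\,\hat d_{Ri}. \qquad (22.157)$$

In this notation, ‘$\alpha$’ is the index of the ‘mass diagonal’ basis, and ‘$i$’ is that 
of the ‘weak interaction’ basis.$^5$ Then (22.153) becomes 

$$\hat{\mathcal{L}}_{qH} = -\left(1 + \frac{\hat H}{v}\right)\left[m_u\,\bar{\hat u}\hat u + \ldots + m_b\,\bar{\hat b}\hat b\right]. \qquad (22.158)$$

Rather remarkably, we can still manage with only the one Higgs field. It 
couples to each fermion with a strength proportional to the mass of that 
fermion, divided by $M_W$. 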

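Now consider the SU(2)$_L$ × U(1) gauge invariant interaction part of the 
Lagrangian. Written out in terms of the ‘weak interaction’ fields $\hat u_{L,Ri}$ and 
$\hat d_{L,Ri}$ (cf (22.43) and (22.44)), it is 

$$\hat{\mathcal{L}}_{f,W,B} = i\,(\bar{\hat u}_{Lj}, \bar{\hat d}_{Lj})\,\gamma^\mu\left(\partial_\mu + i g\,\boldsymbol{\tau}\cdot\hat{\boldsymbol{W}}_\mu/2 + i g' y \hat B_\mu/2\right)\begin{pmatrix} \hat u_{Lj} \\ \hat d_{Lj} \end{pmatrix}$$
$$\qquad + i\,\bar{\hat u}_{Rj}\gamma^\mu\left(\partial_\mu + i g' y \hat B_\mu/2\right)\hat u_{Rj} + i\,\bar{\hat d}_{Rj}\gamma^\mu\left(\partial_\mu + i g' y \hat B_\mu/2\right)\hat d_{Rj} \qquad (22.159)$$

where a sum on $j$ is understood. This now has to be rewritten in terms of the 
mass-eigenstate fields $\hat u_{L,R\alpha}$ and $\hat d_{L,R\alpha}$. 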

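Problem 22.7 shows that the neutral current part of (22.159) is diagonal in 
the mass basis, provided the U matrices of (22.156) and (22.157) are unitary; 
that is, the neutral current interactions do not change the flavour of the 
physical (mass eigenstate) quarks. The charged current processes, however, involve 
the non-diagonal matrices $\tau_1$ and $\tau_2$ in (22.159), and this spoils the argument 
used in problem 22.7. Indeed, using (22.47) we find that the charged current 
piece is 

$$\hat{\mathcal{L}}_{CC} = -\frac{g}{\sqrt2}\,(\bar{\hat u}_{Lj}, \bar{\hat d}_{Lj})\,\gamma^\mu\tau_+\hat W_\mu\begin{pmatrix} \hat u_{Lj} \\ \hat d_{Lj} \end{pmatrix} + \text{h.c.}$$
$$\qquad = -\frac{g}{\sqrt2}\,\bar{\hat u}_{Lj}\gamma^\mu\hat d_{Lj}\hat W_\mu + \text{h.c.}$$
$$\qquad = -\frac{g}{\sqrt2}\,\bar{\hat u}_{L\alpha}\left[(U_L^{(u)})_{\alpha j}(U_L^{(d)\dagger})_{j\beta}\right]\gamma^\mu\hat d_{L\beta}\hat W_\mu + \text{h.c.}, \qquad (22.160)$$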
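where the matrix 

$$V_{\alpha\beta} \equiv \left[U_L^{(u)} U_L^{(d)\dagger}\right]_{\alpha\beta} \qquad (22.161)$$

is not diagonal, though it is unitary. This is the CKM matrix (Cabibbo 
1963, Kobayashi and Maskawa 1973), originally introduced by Kobayashi 
and Maskawa in the context of their three-generation extension of the 
then-developing Standard Model, in order to provide room for CP violation within 
the SU(2) × U(1) gauge theory framework. The interaction (22.160) then has 
the form 

$$-\frac{g}{\sqrt2}\,\hat W_\mu\left[\bar{\hat u}_L\gamma^\mu\hat d'_L + \bar{\hat c}_L\gamma^\mu\hat s'_L + \bar{\hat t}_L\gamma^\mu\hat b'_L\right] + \text{h.c.}, \qquad (22.162)$$

where 

$$\begin{pmatrix} \hat d'_L \\ \hat s'_L \\ \hat b'_L \end{pmatrix} = \begin{pmatrix} V_{ud} & V_{us} & V_{ub} \\ V_{cd} & V_{cs} & V_{cb} \\ V_{td} & V_{ts} & V_{tb} \end{pmatrix}\begin{pmatrix} \hat d_L \\ \hat s_L \\ \hat b_L \end{pmatrix}, \qquad (22.163)$$

with the phenomenology described in the previous chapter. 

$^5$ So, for example, $\hat u_{L\alpha=t} = \hat t_L$, $\hat d_{L\alpha=s} = \hat s_L$, etc. 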
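
The chain from (22.153) to (22.161) can be illustrated numerically. In the sketch below the two mass matrices are random complex matrices (purely illustrative, with no physical content), and the four unitary matrices are obtained from a singular value decomposition, which is one convenient way of carrying out the bi-unitary diagonalization; the singular values come out in descending order, so the index $\alpha$ here does not follow the physical $u, c, t$ labelling.

```python
import numpy as np

rng = np.random.default_rng(1)


def random_mass_matrix(rng):
    """A generic complex 3x3 matrix standing in for m^u or m^d."""
    return rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))


def diagonalize(m):
    """Return (U_L, U_R, masses) with U_L @ m @ U_R^dagger = diag(masses)."""
    a, s, bh = np.linalg.svd(m)      # m = a @ diag(s) @ bh
    return a.conj().T, bh, s


m_u = random_mass_matrix(rng)
m_d = random_mass_matrix(rng)
ul_u, ur_u, masses_u = diagonalize(m_u)
ul_d, ur_d, masses_d = diagonalize(m_d)

v_ckm = ul_u @ ul_d.conj().T         # mixing matrix of (22.161)
neutral = ul_u @ ul_u.conj().T       # combination entering neutral currents

off_diagonal = np.abs(v_ckm - np.diag(np.diag(v_ckm))).max()
```

The neutral-current combination is the unit matrix, so no flavour change arises there, while `v_ckm` is unitary but has non-zero off-diagonal entries because the u-type and d-type left-handed rotations differ.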

An analysis similar to the above can be carried out in the leptonic sector. 
We would then have leptonic flavour mixing in charged current processes, 
via the leptonic analogue of the CKM matrix, namely the PMNS matrix 
(Pontecorvo 1957, 1958, 1967; Maki, Nakagawa and Sakata 1962); this is 
the matrix whose elements are probed in neutrino oscillations, as we saw in 
chapter 21. 




22.6 Higher-order corrections 


The $Z^0$ mass 

$$M_Z = 91.1876 \pm 0.0021\ \text{GeV} \qquad (22.164)$$

has been determined from the Z-lineshape scan at LEP1 (Schael et al. 2006). 
The W mass is (Nakamura et al. 2010) 

$$M_W = 80.399 \pm 0.023\ \text{GeV}. \qquad (22.165)$$

The asymmetry parameter $A_e$ (see (22.100)) is (Abe et al. 2000) 

$$A_e = 0.15138 \pm 0.00216 \qquad (22.166)$$

from measurements at SLD. These are just three examples from the table 
of 36 observables listed in the review of the electroweak model by Erler and 
Langacker in Nakamura et al. (2010). Such remarkable precision is a triumph 
of machine design and experimental art, and it is the reason why we 
need a renormalizable electroweak theory. The overall fit to the data, including 
higher-order corrections, is generally very good, as quoted by Erler and 
Langacker with $\chi^2$/d.o.f. = 43.0/44. One of the few discrepancies is a $2.7\sigma$ 
deviation in the Z-pole forward-backward asymmetry $A_{FB}^{(0,b)}$ from LEP1; another 
is a $2.5\sigma$ deviation in the muon anomalous magnetic moment, $g_\mu - 2$. This 
strong numerical consistency lends impressive support to the belief that we 
are indeed dealing with a renormalizable spontaneously broken gauge theory: 
renormalizable, because no extra parameters, not in the original Lagrangian, 
have had to be introduced; a gauge theory, because the fermion and gauge 
boson couplings obey the relations imposed by the local SU(2) x U(1) sym- 
metry; and spontaneously broken because the same symmetry is not seen in 
the particle spectrum (consider the mass separation in the t-b doublet, for 
instance). 


In fact, one can turn this around, in more than one way. First, one crucially 
important element in the theory, the Higgs boson, has a mass $m_H$ which 
is largely unconstrained by theory (see section 22.8.2), and it is therefore a 
parameter in the fits. Some information about $m_H$ can therefore be gained by 
seeing how the fits vary with $m_H$. Actually, we shall see in equation (22.181) 
that the dependence on $m_H$ is only logarithmic (it acts rather like a cut-off), 
so the fits are not very sensitive to $m_H$. The 90% central confidence range 
from all precision data is given by Erler and Langacker as 55 GeV $< m_H <$ 
135 GeV. By contrast, some loop corrections are proportional to the square 
of the top mass (see (22.180)) and consequently very tight bounds could be 
placed on $m_t$ via its virtual presence (i.e. in loops, for example as shown 
in figure 22.25) before its real presence was confirmed, as we shall discuss 
shortly and in section 22.7. Secondly, it is still entirely possible that very 
careful analysis of small discrepancies between precision data and electroweak 
predictions may indicate the presence of ‘new physics’. 

After all this (and earlier) emphasis on the renormalizability of the elec- 
troweak theory, and the introduction to one-loop calculations in QED at the 
end of volume 1, the reader perhaps has a right to expect, now, an exposition 
of loop corrections in the electroweak theory. But the fact is that this is a very 
complicated and technical story, requiring quite a bit more formal machinery, 
which would be outside the intended scope of this book (suitable references in- 
clude Altarelli et al. 1989, especially the pedagogical account by Consoli et al. 
1989; and the equally approachable lectures by Hollik 1991). Instead, we want 
to touch on just a few of the simpler and more important aspects of one-loop 
corrections, especially insofar as they have phenomenological implications. 

As we have seen, we obtain cut-off independent results from loop correc- 
tions in a renormalizable theory by taking the values of certain parameters — 
those appearing in the original Lagrangian — from experiment, according to 
a well-defined procedure (‘renormalization scheme’). In the electroweak case, 
the parameters in the Lagrangian are 


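gauge couplings $g$, $g'$ (22.167) 
Higgs potential parameters $\lambda$, $\mu^2$ (22.168) 
Higgs-fermion Yukawa couplings $g_f$ (22.169) 
CKM angles $\theta_{12}$, $\theta_{13}$, $\theta_{23}$; phase $\delta$ (22.170) 
PMNS angles $\theta_{\nu 12}$, $\theta_{\nu 23}$, $\theta_{\nu 13}$; phase $\delta_\nu$ (+ $\alpha_{21}$, $\alpha_{31}$?). (22.171) 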
The fermion masses and mixings, and the Higgs mass, can be separated off, 
leaving $g$, $g'$ and one combination of $\lambda$ and $\mu^2$ (for instance, the tree-level 
vacuum value $v$). These three parameters are usually replaced by the equivalent 
and more convenient set 

$\alpha$ (Bouchendira et al. 2011); (22.172) 

$G_F$ (Marciano and Sirlin 1988, van Ritbergen and Stuart 1999), (22.173) 

(see also Nir 1989, Pak and Czarnecki 2008, Chitwood et al. 2007, and Barczyk 
et al. 2008); and 

$M_Z$ (Schael et al. 2006). (22.174) 

These are, of course, related to $g$, $g'$ and $v$; for example, at tree-level 

$$\alpha = \frac{g^2 g'^2}{(g^2 + g'^2)\,4\pi}, \qquad M_Z = \tfrac12 v\sqrt{g^2 + g'^2}, \qquad G_F = \frac{1}{\sqrt2\, v^2}, \qquad (22.175)$$

but these relations become modified in higher order. The renormalized 
parameters will ‘run’ in the way described in chapters 15 and 16; the running of 
$\alpha$, for example, has been observed directly, as noted in section 11.5. 
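
To see how the set (22.172)-(22.174) fixes $g$, $g'$ and $v$ at tree level, one can invert (22.175) numerically. The sketch below does this under stated assumptions: the values of $\alpha$ and $G_F$ are standard numbers that are not quoted in this section, the root with $g > g'$ is selected by hand, and the relations $\sin^2\theta_W = g'^2/(g^2 + g'^2)$ and $M_W = gv/2$ are the usual tree-level ones.

```python
import math

ALPHA = 1.0 / 137.036   # fine structure constant (assumed input)
G_F = 1.16637e-5        # Fermi constant in GeV^-2 (assumed input)
M_Z = 91.1876           # GeV, from (22.164)


def tree_level_parameters(alpha=ALPHA, g_f=G_F, m_z=M_Z):
    """Invert the tree-level relations (22.175) for (v, g, g')."""
    v = (math.sqrt(2.0) * g_f) ** -0.5
    g_sum = 4.0 * m_z**2 / v**2                  # g^2 + g'^2
    product = 4.0 * math.pi * alpha * g_sum      # g^2 g'^2
    # g^2 and g'^2 are the two roots of x^2 - g_sum x + product = 0.
    disc = math.sqrt(g_sum**2 - 4.0 * product)
    g_sq = 0.5 * (g_sum + disc)                  # choose g > g'
    gp_sq = 0.5 * (g_sum - disc)
    return v, math.sqrt(g_sq), math.sqrt(gp_sq)


v, g, g_prime = tree_level_parameters()
sin2_theta_tree = g_prime**2 / (g**2 + g_prime**2)
m_w_tree = 0.5 * g * v
```

With these inputs $v$ comes out close to 246 GeV, and the tree-level $\sin^2\theta_W$ and $M_W$ differ noticeably from the measured values, a reminder that the relations are modified in higher order.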

After renormalization one can derive radiatively corrected values for physical 
quantities in terms of the set (22.172)-(22.174) (together with $m_H$ and the 
fermion masses and mixings). But a renormalization scheme has to be specified, 
at any finite order (though in practice the differences are very small). One 
conceptually simple scheme is the ‘on-shell’ one (Sirlin 1980, 1984; Kennedy 
et al. 1989; Kennedy and Lynn 1989; Bardin et al. 1989; Hollik 1990; for 
reviews see Langacker 1995). In this scheme, the tree-level formula 

$$\sin^2\theta_W = 1 - M_W^2/M_Z^2 \qquad (22.176)$$

is promoted into a definition of the renormalized $\sin^2\theta_W$ to all orders in 
perturbation theory, it being then denoted by $s_W^2$: 

$$s_W^2 = 1 - M_W^2/M_Z^2 \simeq 0.223. \qquad (22.177)$$

The radiatively corrected value for $M_W$ is then 

$$M_W^2 = \frac{\pi\alpha/\sqrt2\, G_F}{s_W^2\,(1 - \Delta r)} \qquad (22.178)$$

where $\Delta r$ includes the radiative corrections relating $\alpha$, $\alpha(M_Z)$, $G_F$, $M_W$ and 
$M_Z$. Another scheme is the modified minimal subtraction ($\overline{\rm MS}$) scheme 
(appendix O) which introduces the quantity 
$\sin^2\hat\theta_W(\mu) = \hat g'^2(\mu)/[\hat g^2(\mu) + \hat g'^2(\mu)]$ 
where the couplings $\hat g$ and $\hat g'$ are defined in the $\overline{\rm MS}$ scheme and $\mu$ is 
chosen to be $M_Z$ for most electroweak processes. Attention is then focused on 
$\hat s_Z^2 \equiv \sin^2\hat\theta_W(M_Z)$. This is the scheme used by Erler and Langacker in 
Nakamura et al. (2010). 
We shall continue here with the scheme defined by (22.177). We cannot go 
into detail about all the contributions to $\Delta r$, but we do want to highlight two 
features of the result, which are surprising, important phenomenologically, 
and related to an interesting symmetry. It turns out (Consoli et al. 1989, 
Hollik 1991) that the leading terms in $\Delta r$ have the form 

$$\Delta r = \Delta r_0 - \frac{(1 - s_W^2)}{s_W^2}\,\Delta\rho + (\Delta r)_{\rm rem}. \qquad (22.179)$$

In (22.179), $\Delta r_0 = 1 - \alpha/\alpha(M_Z)$ is due to the running of $\alpha$, and has the value 
$\Delta r_0 = 0.0664(2)$ (see section 11.5.3). $\Delta\rho$ is given by (Veltman 1977) 

$$\Delta\rho = \frac{3 G_F (m_t^2 - m_b^2)}{8\pi^2\sqrt2}, \qquad (22.180)$$

while the ‘remainder’ $(\Delta r)_{\rm rem}$ contains a significant term proportional to 
$\ln(m_t/m_Z)$, and a contribution from the Higgs boson which is (for $m_H \gg M_W$) 

$$(\Delta r)_{{\rm rem},H} \approx \frac{\sqrt2\, G_F M_W^2}{16\pi^2}\,\frac{11}{3}\left[\ln\left(\frac{m_H^2}{M_W^2}\right) - \frac56\right]. \qquad (22.181)$$
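
The different sensitivities to $m_t$ and $m_H$ in (22.179)-(22.181) can be made concrete with a short numerical sketch. The value of $G_F$, the b-quark mass default and the trial values of $m_t$ and $m_H$ are assumed inputs chosen for illustration; only the Higgs part of the remainder is included, so `delta_r_leading` is not a complete evaluation of $\Delta r$.

```python
import math

G_F = 1.16637e-5    # GeV^-2 (assumed input)
M_W = 80.399        # GeV, from (22.165)
M_Z = 91.1876       # GeV, from (22.164)
DELTA_R0 = 0.0664   # value quoted after (22.179)


def delta_rho(m_t, m_b=4.2):
    """Leading top-bottom contribution (22.180); masses in GeV."""
    return 3.0 * G_F * (m_t**2 - m_b**2) / (8.0 * math.pi**2 * math.sqrt(2.0))


def delta_r_higgs(m_h):
    """Higgs part of the remainder, (22.181), intended for m_H >> M_W."""
    prefactor = math.sqrt(2.0) * G_F * M_W**2 / (16.0 * math.pi**2)
    return prefactor * (11.0 / 3.0) * (math.log(m_h**2 / M_W**2) - 5.0 / 6.0)


def delta_r_leading(m_t, m_h):
    """(22.179) with only the Higgs term kept in the remainder."""
    s_w_sq = 1.0 - M_W**2 / M_Z**2
    return DELTA_R0 - (1.0 - s_w_sq) / s_w_sq * delta_rho(m_t) + delta_r_higgs(m_h)


# Quadratic sensitivity to m_t versus logarithmic sensitivity to m_H.
rho_173 = delta_rho(173.0)
top_ratio = delta_rho(346.0) / delta_rho(173.0)
higgs_step = delta_r_higgs(400.0) - delta_r_higgs(200.0)
```

Doubling $m_t$ roughly quadruples $\Delta\rho$, whereas doubling $m_H$ shifts the Higgs term by the same fixed amount whatever the starting mass.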


As the notation suggests, $\Delta\rho$ is a leading contribution to the parameter $\rho$ 
introduced in (22.66). As explained there, it measures the strength of neutral 
current processes relative to charged current ones. $\Delta\rho$ is then a radiative 
correction to $\rho$. It turns out that, to good approximation, electroweak radiative 
corrections in $e^+e^- \to Z^0 \to f\bar f$ can be included by replacing the fermionic 
couplings $g_V^f$ and $g_A^f$ (see (22.64), (22.75) and (22.76)) by 

$$\bar g_V^f = \sqrt{\rho_f}\,\left(t_3^{(f)} - 2 Q_f \kappa_f s_W^2\right) \qquad (22.182)$$

and 

$$\bar g_A^f = \sqrt{\rho_f}\; t_3^{(f)} \qquad (22.183)$$

together with corrections to the $Z^0$-propagator. The corrections have the 
form (in the on-shell scheme) $\rho_f \approx 1 + \Delta\rho$ (of equation (22.180)) and 
$\kappa_f \approx 1 + \frac{(1 - s_W^2)}{s_W^2}\Delta\rho$, for $f \neq b, t$. For the b-quark there is an additional contribution 
coming from the presence of the virtual top quark in vertex corrections to 
$Z \to b\bar b$ (Akhundov et al. 1986, Beenakker and Hollik 1988). 
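
As an illustration of the size of these effects, the sketch below evaluates (22.182) and (22.183) for the electron and forms the asymmetry parameter. Two assumptions are made: that $A_e$ of (22.100) has the standard form $2 g_V g_A/(g_V^2 + g_A^2)$, and that $\Delta\rho \approx 0.0094$, roughly the value (22.180) gives for a top mass near 173 GeV. All remaining corrections are ignored, so close agreement with (22.166) should not be expected.

```python
import math

S_W_SQ = 0.223       # on-shell value, (22.177)
DELTA_RHO = 0.0094   # illustrative, roughly (22.180) for m_t near 173 GeV


def effective_couplings(t3, charge, s_w_sq=S_W_SQ, delta_rho=DELTA_RHO):
    """Effective couplings (22.182), (22.183) with rho_f and kappa_f as in the text."""
    rho_f = 1.0 + delta_rho
    kappa_f = 1.0 + (1.0 - s_w_sq) / s_w_sq * delta_rho
    g_v = math.sqrt(rho_f) * (t3 - 2.0 * charge * kappa_f * s_w_sq)
    g_a = math.sqrt(rho_f) * t3
    return g_v, g_a


def asymmetry(g_v, g_a):
    """Assumed standard form of the asymmetry parameter A_f."""
    return 2.0 * g_v * g_a / (g_v**2 + g_a**2)


# Electron: t3 = -1/2, Q = -1.
gv_e, ga_e = effective_couplings(t3=-0.5, charge=-1.0)
a_e = asymmetry(gv_e, ga_e)
a_e_tree = asymmetry(*effective_couplings(-0.5, -1.0, delta_rho=0.0))
```

Because $g_V^e$ is small, $A_e$ is very sensitive to the effective value of $\sin^2\theta_W$: including $\Delta\rho$ moves the result from the uncorrected `a_e_tree` substantially closer to the measured value (22.166).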

The running of $\alpha$ in $\Delta r_0$ is expected, but (22.180) and (22.181) contain 
surprising features. As regards (22.180), it is associated with top-bottom 
quark loops in vacuum polarization amplitudes, of the kind discussed 
in section 11.5, but this time in weak boson propagators. In the QED case, 
referring to equation (11.39) for example, we saw that the contribution of heavy 
fermions ‘($|q^2| \ll m_f^2$)’ was suppressed, appearing as $O(|q^2|/m_f^2)$. In such a 
situation (which is the usual one) the heavy particles are said to ‘decouple’. 
But the correction (22.180) is quite different, the fermion masses being in the 
FIGURE 22.25 
t - b vacuum polarization contribution. 


numerator. Clearly, with a large value of $m_t$, this can make a relatively big 
difference. This is why some precision measurements are surprisingly sensitive 
to the value of $m_t$, in the range near (as we now know) the physical value. 
Secondly, as regards the dependence on $m_H$, we might well have expected it 
to involve $m_H^2$ in the numerator if we considered the typical divergence of a 
scalar particle in a loop (we shall return to this after discussing (22.180)). $\Delta r$ 
would then have been very sensitive to $m_H$, but in fact the sensitivity is only 
logarithmic. 

We can understand the appearance of the fermion masses (squared) in the
numerator of (22.180) as follows. The shift $\Delta\rho$ is associated with vector boson
vacuum polarization contributions, for example the one shown in figure 22.25.
Consider in particular the contribution from the longitudinal polarization
components of the W’s. As we have seen, these components are nothing but
three of the four Higgs components which the $W^\pm$ and $Z^0$ ‘swallowed’ to become
massive. But the couplings of these ‘swallowed’ Higgs fields to fermions
are determined by just the same Higgs-fermion Yukawa couplings as we introduced
to generate the fermion masses via spontaneous symmetry breaking.
Hence we expect the fermion loops to contribute (to these longitudinal W
states) something of order $g_f^2/4\pi$, where $g_f$ is the Yukawa coupling. Since
$g_f \sim m_f/v$ (see (22.127)) we arrive at an estimate $\sim m_f^2/4\pi v^2 \sim G_F m_f^2/4\pi$
as in (22.180). An important message is that particles which acquire their
mass spontaneously do not ‘decouple’.
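
To get a feeling for the size of this non-decoupling effect, the following minimal sketch evaluates the leading top-bottom shift numerically, assuming the form $\Delta\rho = 3G_F(m_t^2 - m_b^2)/(8\pi^2\sqrt{2})$ for (22.180). The input values of $G_F$, $m_t$ and $m_b$ are illustrative assumptions, not numbers taken from the text, and the function names are hypothetical.

```python
import math

# Assumed illustrative inputs (not quoted in the text above)
G_F = 1.16638e-5   # Fermi constant in GeV^-2
M_TOP = 173.0      # top quark mass in GeV
M_BOTTOM = 4.2     # bottom quark mass in GeV


def delta_rho(m_t, m_b, g_f=G_F):
    """Leading t-b loop shift, assuming the form 3 G_F (m_t^2 - m_b^2) / (8 pi^2 sqrt 2)."""
    return 3.0 * g_f * (m_t**2 - m_b**2) / (8.0 * math.pi**2 * math.sqrt(2.0))


def yukawa_estimate(m_f, g_f=G_F):
    """Order-of-magnitude estimate G_F m_f^2 / (4 pi) from the Yukawa-coupling argument."""
    return g_f * m_f**2 / (4.0 * math.pi)


if __name__ == "__main__":
    print("Delta rho        =", delta_rho(M_TOP, M_BOTTOM))
    print("Yukawa estimate  =", yukawa_estimate(M_TOP))
    print("degenerate pair  =", delta_rho(M_TOP, M_TOP))
```

With these inputs the shift is a little under one per cent, the crude Yukawa estimate is of the same order of magnitude, and the shift vanishes identically for a degenerate doublet.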

But we now have to explain why $\Delta\rho$ in (22.180) would vanish if $m_t^2 = m_b^2$,
and why only $\ln m_H^2$ appears in (22.181). Both these facts are related to
a symmetry of the assumed minimal Higgs sector which we have not yet
discussed. Let us first consider the situation at tree level, where $\rho = 1$. It
may be shown (Ross and Veltman 1975) that $\rho = 1$ is a natural consequence
of having the symmetry broken by an $SU(2)_L$ doublet Higgs field (rather than
a triplet, say), or indeed by any number of doublets. The nearness of the
measured $\rho$ parameter to 1 is, in fact, good support for the hypothesis that
there are only doublet Higgs fields. Problem 22.8 explores a simple model
with a Higgs field in the triplet representation.

At tree level, it is simplest to think of $\rho$ in connection with the mass ratio
(22.66). To see the significance of this, let us go back to the Higgs-gauge field
Lagrangian $\hat{\mathcal L}_{G\Phi}$ of (22.30) which produced the gauge boson masses. With
the doublet Higgs of the form (22.131), it is a striking fact that the Higgs
potential only involves the highly symmetrical combination of fields

$$ \hat\phi_1^2 + \hat\phi_2^2 + \hat\phi_3^2 + \hat\phi_4^2, \tag{22.184} $$

as does the vacuum condition (17.102). This suggests that there may be
some extra symmetry in (22.30) which is special to the doublet structure.
But of course, to be of any interest, this symmetry has to be present in the
$(D_\mu\hat\phi)^\dagger(D^\mu\hat\phi)$ term as well.

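The nature of this symmetry is best brought out by introducing a change
of notation for the Higgs doublet $\hat\phi^+$ and $\hat\phi^0$: instead of (22.131), we now write
(cf (18.70))

$$ \hat\phi = \begin{pmatrix} (\hat\pi_2 + i\hat\pi_1)/\sqrt{2} \\ (\hat\sigma - i\hat\pi_3)/\sqrt{2} \end{pmatrix} \tag{22.185} $$

while the $\hat\phi_C$ field of (22.132) becomes

$$ \hat\phi_C = \begin{pmatrix} (\hat\sigma + i\hat\pi_3)/\sqrt{2} \\ -(\hat\pi_2 - i\hat\pi_1)/\sqrt{2} \end{pmatrix}. \tag{22.186} $$

We then find that these can be written as

$$ \hat\phi = \frac{1}{\sqrt{2}}(\hat\sigma + i\boldsymbol{\tau}\cdot\hat{\boldsymbol{\pi}})\begin{pmatrix} 0 \\ 1 \end{pmatrix}, \qquad \hat\phi_C = \frac{1}{\sqrt{2}}(\hat\sigma + i\boldsymbol{\tau}\cdot\hat{\boldsymbol{\pi}})\begin{pmatrix} 1 \\ 0 \end{pmatrix}. \tag{22.187} $$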
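
The matrix form of the doublet can be checked numerically. The sketch below treats $\hat\sigma$ and $\hat{\boldsymbol\pi}$ as ordinary real numbers (an assumption made only for this check) and verifies that $(\hat\sigma + i\boldsymbol\tau\cdot\hat{\boldsymbol\pi})/\sqrt{2}$ acting on $(0,1)^T$ and $(1,0)^T$ gives the components $((\hat\pi_2 + i\hat\pi_1)/\sqrt2,\ (\hat\sigma - i\hat\pi_3)/\sqrt2)$ and the charge-conjugate doublet $i\tau_2\hat\phi^*$ respectively. The function names are hypothetical.

```python
import numpy as np

# Pauli matrices tau_1, tau_2, tau_3
TAU = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)


def sigma_plus_i_tau_dot_pi(sigma, pi):
    """2x2 matrix sigma + i tau.pi for a real number sigma and a real 3-vector pi."""
    return sigma * np.eye(2) + 1j * np.einsum("i,ijk->jk", pi, TAU)


def phi(sigma, pi):
    """Doublet obtained by acting on the column (0, 1)."""
    return sigma_plus_i_tau_dot_pi(sigma, pi) @ np.array([0.0, 1.0]) / np.sqrt(2)


def phi_c(sigma, pi):
    """Charge-conjugate doublet obtained by acting on the column (1, 0)."""
    return sigma_plus_i_tau_dot_pi(sigma, pi) @ np.array([1.0, 0.0]) / np.sqrt(2)


def phi_components(sigma, pi):
    """Explicit components ((pi_2 + i pi_1), (sigma - i pi_3)) / sqrt 2."""
    return np.array([pi[1] + 1j * pi[0], sigma - 1j * pi[2]]) / np.sqrt(2)


if __name__ == "__main__":
    s, p = 0.7, np.array([0.3, -1.1, 0.5])
    print(np.allclose(phi(s, p), phi_components(s, p)))
    print(np.allclose(phi_c(s, p), 1j * TAU[1] @ np.conj(phi(s, p))))
```

The same check shows that $\hat\phi^\dagger\hat\phi = (\hat\sigma^2 + \hat{\boldsymbol\pi}^2)/2$, which is the symmetrical combination entering the potential.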


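Consider now the covariant $SU(2)_L \times U(1)$ derivative acting on $\hat\phi$, as in (22.30),
and suppose to begin with that $g' = 0$. Then

$$ \begin{aligned}
D_\mu\hat\phi &= \left(\partial_\mu + ig\,\boldsymbol{\tau}\cdot\hat{\boldsymbol{W}}_\mu/2\right)\frac{1}{\sqrt{2}}(\hat\sigma + i\boldsymbol{\tau}\cdot\hat{\boldsymbol{\pi}})\begin{pmatrix} 0 \\ 1 \end{pmatrix} \\
&= \frac{1}{\sqrt{2}}\Big\{\partial_\mu\hat\sigma + i\boldsymbol{\tau}\cdot\partial_\mu\hat{\boldsymbol{\pi}} + i\frac{g}{2}\hat\sigma\,\boldsymbol{\tau}\cdot\hat{\boldsymbol{W}}_\mu \\
&\qquad\quad - \frac{g}{2}\hat{\boldsymbol{\pi}}\cdot\hat{\boldsymbol{W}}_\mu + i\frac{g}{2}\boldsymbol{\tau}\cdot(\hat{\boldsymbol{\pi}}\times\hat{\boldsymbol{W}}_\mu)\Big\}\begin{pmatrix} 0 \\ 1 \end{pmatrix}
\end{aligned} \tag{22.188} $$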


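using $\tau_i\tau_j = \delta_{ij} + i\epsilon_{ijk}\tau_k$. Now the vacuum choice (22.28) corresponds to
$\hat\sigma = v$, $\hat{\boldsymbol{\pi}} = \boldsymbol{0}$, so that when we form $(D_\mu\hat\phi)^\dagger(D^\mu\hat\phi)$ from (22.188) we will get
just

$$ \frac{1}{2}\,(0, 1)\,\frac{g^2v^2}{4}\,(\boldsymbol{\tau}\cdot\hat{\boldsymbol{W}}_\mu)(\boldsymbol{\tau}\cdot\hat{\boldsymbol{W}}^\mu)\begin{pmatrix} 0 \\ 1 \end{pmatrix} = \frac{1}{2}M_W^2\,\hat{\boldsymbol{W}}_\mu\cdot\hat{\boldsymbol{W}}^\mu \tag{22.189} $$

with $M_W = gv/2$ as usual. The condition $g' = 0$ corresponds (cf (22.39))
to $\theta_W = 0$, and thus to $\hat W_{3\mu} = \hat Z_\mu$, and so (22.189) says that in the limit
$g' \to 0$, $M_W = M_Z$, as expected if $\cos\theta_W = 1$. It is clear from (22.188)
that the three components $\hat{\boldsymbol{W}}_\mu$ are treated on a precisely equal footing by the
Higgs field (22.185), and indeed the notation suggests that $\hat{\boldsymbol{W}}_\mu$ and $\hat{\boldsymbol{\pi}}$ should
perhaps be regarded as some kind of new triplets.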

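It is straightforward to calculate $(D_\mu\hat\phi)^\dagger(D^\mu\hat\phi)$ from (22.188); one finds
(problem 22.9)

$$ \begin{aligned}
(D_\mu\hat\phi)^\dagger(D^\mu\hat\phi) &= \frac{1}{2}\partial_\mu\hat\sigma\,\partial^\mu\hat\sigma + \frac{1}{2}\partial_\mu\hat{\boldsymbol{\pi}}\cdot\partial^\mu\hat{\boldsymbol{\pi}} - \frac{g}{2}\partial_\mu\hat\sigma\,\hat{\boldsymbol{\pi}}\cdot\hat{\boldsymbol{W}}^\mu \\
&\quad + \frac{g}{2}\hat\sigma\,\partial_\mu\hat{\boldsymbol{\pi}}\cdot\hat{\boldsymbol{W}}^\mu + \frac{g}{2}\partial_\mu\hat{\boldsymbol{\pi}}\cdot(\hat{\boldsymbol{\pi}}\times\hat{\boldsymbol{W}}^\mu) \\
&\quad + \frac{g^2}{8}\hat{\boldsymbol{W}}^2(\hat\sigma^2 + \hat{\boldsymbol{\pi}}^2).
\end{aligned} \tag{22.190} $$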
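
As a cross-check of this expansion, the sketch below compares the matrix expression with the expanded form for random classical field values. Two simplifying assumptions are made purely for the check: the fields are treated as real numbers, and only a single Lorentz component is kept, so that no metric signs appear. It also confirms that the expanded form is unchanged when $\hat{\boldsymbol W}$, $\hat{\boldsymbol\pi}$ and $\partial\hat{\boldsymbol\pi}$ are rotated together as vectors. All function names are hypothetical.

```python
import numpy as np

TAU = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)
DOWN = np.array([0.0, 1.0])


def tau_dot(v):
    """2x2 matrix tau.v for a real 3-vector v."""
    return np.einsum("i,ijk->jk", v, TAU)


def kinetic_from_matrices(g, W, sigma, pi, d_sigma, d_pi):
    """|D phi|^2 built directly from D = d + i g tau.W / 2 acting on the doublet."""
    field = sigma * np.eye(2) + 1j * tau_dot(pi)
    d_field = d_sigma * np.eye(2) + 1j * tau_dot(d_pi)
    d_phi = (d_field + 0.5j * g * tau_dot(W) @ field) @ DOWN / np.sqrt(2)
    return np.vdot(d_phi, d_phi).real


def kinetic_expanded(g, W, sigma, pi, d_sigma, d_pi):
    """The same quantity written out term by term, as in the expansion above."""
    return (0.5 * d_sigma**2 + 0.5 * d_pi @ d_pi
            - 0.5 * g * d_sigma * (pi @ W)
            + 0.5 * g * sigma * (d_pi @ W)
            + 0.5 * g * (d_pi @ np.cross(pi, W))
            + g**2 / 8.0 * (W @ W) * (sigma**2 + pi @ pi))


def rotate_z(v, angle):
    """Rotate a 3-vector about the third axis (a global SU(2) 'vector' rotation)."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]) @ v


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    g, sigma, d_sigma = 0.65, 1.3, -0.4
    W, pi, d_pi = rng.normal(size=3), rng.normal(size=3), rng.normal(size=3)
    print(kinetic_from_matrices(g, W, sigma, pi, d_sigma, d_pi))
    print(kinetic_expanded(g, W, sigma, pi, d_sigma, d_pi))
```

The rotation used here is about one fixed axis only; since every term is built from scalar and triple products of the three vectors, the same invariance holds for any proper rotation.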


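This expression now reveals what the symmetry is: (22.190) is invariant under
global SU(2) transformations under which $\hat{\boldsymbol{W}}_\mu$ and $\hat{\boldsymbol{\pi}}$ are vectors, that is

$$ \begin{aligned}
\hat{\boldsymbol{W}}_\mu &\to \hat{\boldsymbol{W}}_\mu + \boldsymbol{\epsilon}\times\hat{\boldsymbol{W}}_\mu \\
\hat{\boldsymbol{\pi}} &\to \hat{\boldsymbol{\pi}} + \boldsymbol{\epsilon}\times\hat{\boldsymbol{\pi}} \\
\hat\sigma &\to \hat\sigma.
\end{aligned} \tag{22.191} $$

This is why, from the term $\hat{\boldsymbol{W}}^2\hat\sigma^2$, all three W fields have the same mass in
this $g' \to 0$ limit.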

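If we now reinstate $g'$, and use (22.36) and (22.37) to write $\hat W_{3\mu}$ and $\hat B_\mu$
in terms of the physical fields $\hat Z_\mu$ and $\hat A_\mu$ as in (19.96), (22.188) becomes

$$ \begin{aligned}
D_\mu\hat\phi = \Big\{&\partial_\mu + i\frac{g}{2}\tau_1\hat W_{1\mu} + i\frac{g}{2}\tau_2\hat W_{2\mu} + i\frac{g}{2}\tau_3\frac{\hat Z_\mu}{\cos\theta_W} + ig\sin\theta_W\left(\frac{1+\tau_3}{2}\right)\hat A_\mu \\
&- \frac{ig}{\cos\theta_W}\sin^2\theta_W\left(\frac{1+\tau_3}{2}\right)\hat Z_\mu\Big\}\frac{1}{\sqrt{2}}(\hat\sigma + i\boldsymbol{\tau}\cdot\hat{\boldsymbol{\pi}})\begin{pmatrix} 0 \\ 1 \end{pmatrix}.
\end{aligned} \tag{22.192} $$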


We see from (22.192) that $g' \neq 0$ has two effects. First, there is a ‘$\boldsymbol{\tau}\cdot\boldsymbol{W}$’-like
term, as in (22.188), except that the ‘$W_3$’ part of it is now $Z/\cos\theta_W$.
In the vacuum $\hat\sigma = v$, $\hat{\boldsymbol{\pi}} = \boldsymbol{0}$, this simply means that the mass of the Z is
$M_Z = M_W/\cos\theta_W$, i.e. $\rho = 1$; and this relation is preserved under ‘rotations’
of the form (22.191), since they do not mix $\hat{\boldsymbol{\pi}}$ and $\hat\sigma$. Hence this mass relation
(and $\rho = 1$) is a consequence of the global SU(2) symmetry of the interactions
and the vacuum under (22.191), and of the relations (22.36) and (22.37) which
embody the requirement of a massless photon.

On the other hand, there are additional terms in (22.192) which single
out the ‘$\tau_3$’ component, and therefore break this global SU(2). These terms
vanish as $g' \to 0$, and do not contribute at tree level, but we expect that they
will cause $O(g'^2)$ corrections to $\rho = 1$ at the one loop level.

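None of the above, however, yet involves the quark masses, and the question
of why $m_t^2 - m_b^2$ appears in the numerator in (22.180). We can now answer
this question. Consider a typical mass term, of the form discussed in section
22.5.2, for a quark doublet of the $i$th family

$$ \hat{\mathcal L}_m = -g_+\,(\bar{\hat u}_{Li}\ \bar{\hat d}_{Li})\,\hat\phi_C\,\hat u_{Ri} - g_-\,(\bar{\hat u}_{Li}\ \bar{\hat d}_{Li})\,\hat\phi\,\hat d_{Ri}. \tag{22.193} $$

Using (22.185) and (22.186), this can be written as

$$ \begin{aligned}
\hat{\mathcal L}_m &= -\frac{g_+}{\sqrt{2}}(\bar{\hat u}_{Li}\ \bar{\hat d}_{Li})(\hat\sigma + i\boldsymbol{\tau}\cdot\hat{\boldsymbol{\pi}})\begin{pmatrix} \hat u_{Ri} \\ 0 \end{pmatrix} - \frac{g_-}{\sqrt{2}}(\bar{\hat u}_{Li}\ \bar{\hat d}_{Li})(\hat\sigma + i\boldsymbol{\tau}\cdot\hat{\boldsymbol{\pi}})\begin{pmatrix} 0 \\ \hat d_{Ri} \end{pmatrix} \\
&= -\frac{(g_+ + g_-)}{2\sqrt{2}}(\bar{\hat u}_{Li}\ \bar{\hat d}_{Li})(\hat\sigma + i\boldsymbol{\tau}\cdot\hat{\boldsymbol{\pi}})\begin{pmatrix} \hat u_{Ri} \\ \hat d_{Ri} \end{pmatrix} \\
&\quad - \frac{(g_+ - g_-)}{2\sqrt{2}}(\bar{\hat u}_{Li}\ \bar{\hat d}_{Li})(\hat\sigma + i\boldsymbol{\tau}\cdot\hat{\boldsymbol{\pi}})\,\tau_3\begin{pmatrix} \hat u_{Ri} \\ \hat d_{Ri} \end{pmatrix}.
\end{aligned} \tag{22.194} $$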


Consider now a simultaneous (infinitesimal) global SU(2) transformation
on the two doublets $(\hat u_{Li}, \hat d_{Li})^T$ and $(\hat u_{Ri}, \hat d_{Ri})^T$:

$$ \begin{pmatrix} \hat u_{Li} \\ \hat d_{Li} \end{pmatrix} \to (1 - i\boldsymbol{\epsilon}\cdot\boldsymbol{\tau}/2)\begin{pmatrix} \hat u_{Li} \\ \hat d_{Li} \end{pmatrix}, \qquad \begin{pmatrix} \hat u_{Ri} \\ \hat d_{Ri} \end{pmatrix} \to (1 - i\boldsymbol{\epsilon}\cdot\boldsymbol{\tau}/2)\begin{pmatrix} \hat u_{Ri} \\ \hat d_{Ri} \end{pmatrix}. \tag{22.195} $$

Under (22.195), the first term of (22.194) becomes (to first order in $\boldsymbol{\epsilon}$)

$$ -\frac{(g_+ + g_-)}{2\sqrt{2}}(\bar{\hat u}_{Li}\ \bar{\hat d}_{Li})\left(\hat\sigma + i\boldsymbol{\tau}\cdot(\hat{\boldsymbol{\pi}} + \hat{\boldsymbol{\pi}}\times\boldsymbol{\epsilon})\right)\begin{pmatrix} \hat u_{Ri} \\ \hat d_{Ri} \end{pmatrix}. \tag{22.196} $$

From (22.196) we see that if, at the same time as (22.195), we also make
the transformation of $\hat{\boldsymbol{\pi}}$ given in (22.191), then this first term in $\hat{\mathcal L}_m$ will be
invariant under these combined transformations. The second term in (22.194),
however, will not be invariant under (22.195), but only under transformations
with $\epsilon_1 = \epsilon_2 = 0$, $\epsilon_3 \neq 0$. We conclude that the global SU(2) symmetry of
(22.191), which was responsible for $\rho = 1$ at the tree level, can be extended
also to the quark sector; but (because the $g_\pm$ in (22.193) are proportional to
the masses of the quark doublet) this symmetry is explicitly broken by the
quark mass difference. This is why a t-b loop in a W vacuum polarization
correction can produce the ‘non-decoupled’ contribution (22.180) to $\rho$, which
grows as $m_t^2 - m_b^2$ and produces quite detectable shifts from the tree-level
predictions, given the accuracy of the data.

Returning to (22.195), the transformation on the L-components is just
the same as a standard $SU(2)_L$ transformation, except that it is global; so
the gauge interactions of the quarks obey this symmetry also. As far as
the R-components are concerned, they are totally decoupled in the gauge
dynamics, and we are free to make the transformation (22.195) if we wish.
The resulting complete transformation, which does the same to both the L
and R components, is a non-chiral one; in fact it is precisely an ordinary
‘isospin’ transformation of the type

$$ \begin{pmatrix} \hat u \\ \hat d \end{pmatrix} \to (1 - i\boldsymbol{\epsilon}\cdot\boldsymbol{\tau}/2)\begin{pmatrix} \hat u \\ \hat d \end{pmatrix}. \tag{22.197} $$

The reader will recognize that the mathematics here is exactly the same as
that in section 18.3, involving the SU(2) of isospin in the $\sigma$-model. This
FIGURE 22.26
One-boson self-energy graph in $(\hat\phi^\dagger\hat\phi)^2$.


analysis of the symmetry of the Higgs (or a more general symmetry breaking
sector) was first given by Sikivie et al. (1980). The isospin-SU(2) is frequently
called ‘custodial SU(2)’ since it ‘protects’ $\rho = 1$.

What about the absence of $m_H^2$ corrections? Here the position is rather
more subtle. Without the Higgs particle H the theory is non-renormalizable,
and hence one might expect to see some radiative correction becoming very
large ($O(m_H^2)$) as one tried to ‘banish’ H from the theory by sending $m_H \to \infty$
($m_H$ would be acting like a cut-off). The reason is that in such a $(\hat\phi^\dagger\hat\phi)^2$ theory,
the simplest loop we meet is that shown in figure 22.26, and it is easy to see by
counting powers, as usual, that it diverges as the square of the cut-off. This
loop contributes to the Higgs self-energy, and will be renormalized by taking
the value of the coefficient of $\hat\phi^\dagger\hat\phi$ in (22.30) from experiment. We will return
to this particular detail in section 22.8.1.

Even without a Higgs contribution however, it turns out that the electroweak
theory is renormalizable at the one-loop level if the fermion masses are
zero (Veltman 1968, 1970). Thus one suspects that the large $m_H^2$ effects will not
be so dramatic after all. In fact, calculation shows (Veltman 1977; Chanowitz
et al. 1978, 1979) that one-loop radiative corrections to electroweak observables
grow at most like $\ln m_H$ for large $m_H$. While there are finite corrections
which are approximately $O(m_H^2)$ for $m_H^2 \ll M_{W,Z}^2$, for $m_H^2 \gg M_{W,Z}^2$ the $O(m_H^2)$
pieces cancel out from all observable quantities$^4$, leaving only $\ln m_H^2$ terms.
This is just what we have in (22.181), and it means, unfortunately, that the
sensitivity of the data to this important parameter of the Standard Model is
only logarithmic. Fits to data typically give $m_H$ in the region of 90 GeV at
the minimum of the $\chi^2$ curve, but the error (which is not simple to interpret)
is of the order of 25 GeV.

At the two-loop level, the expected $O(m_H^4)$ behaviour becomes $O(m_H^2)$
instead (van der Bij and Veltman 1984, van der Bij 1984), and of course
appears (relative to the one-loop contributions) with an additional factor of
$O(\alpha)$. This relative insensitivity of the radiative corrections to $m_H$, in the
limit of large $m_H$, was discovered by Veltman (1977) and called a ‘screening’
phenomenon by him: for large $m_H$ (which also means, as we have seen, large
$\lambda$) we have an effectively strongly interacting theory whose principal effects are
screened off from observables at lower energy. It was shown by Einhorn and
Wudka (1989) that this screening is also a consequence of the (approximate)
isospin-SU(2) symmetry we have just discussed in connection with (22.180).
Phenomenologically, the upshot is that it was unfortunately very difficult to
get an accurate handle on the value of $m_H$ from fits to the precision data.
With the top quark, the situation was very different.


4 Apart from the $\hat\phi^\dagger\hat\phi$ coefficient! See section 22.8.1.




22.7 The top quark 


Having drawn attention to the relative sensitivity of radiative corrections to
loops containing virtual top quarks, it is worth devoting a little space to a
‘backward glance’ at the year immediately prior to the discovery of the t-quark
(Abe et al. 1994a, b, 1995b, Abachi et al. 1995b) at the CDF and D0
detectors at FNAL’s Tevatron, in $p\bar p$ collisions at $E_{\rm cm} = 1.8$ TeV.

The W and Z particles were, as we have seen, discovered in 1983 and at
that time, and for some years subsequently, the data were not precise enough
to be sensitive to virtual t-effects. In the late 1980’s and early 1990’s, LEP at
CERN and SLC at Stanford began to produce new and highly accurate data
which did allow increasingly precise predictions to be made for the top quark
mass, $m_t$. Thus a kind of race began, between experimentalists searching for
the real top, and theorists fitting ever more precise data to get tighter and
tighter limits on $m_t$, from its virtual effects.

In fact, by the time of the actual experimental discovery of the top quark,
the experimental error in $m_t$ was just about the same as the theoretical one
(and, of course, the central values were consistent). Thus, in their May
1994 review of the electroweak theory (contained in Montanet et al. 1994, p
1304ff) Erler and Langacker gave the result of a fit to all electroweak data as


$$ m_t = 169\,^{+16}_{-18}\,^{+17}_{-20}\ \text{GeV}, \tag{22.198} $$


the central figure and first error being based on $m_H = 300$ GeV, the second
(+) error assuming $m_H = 1000$ GeV and the second (−) error assuming $m_H =
60$ GeV.$^5$ At about the same time, Ellis et al. (1994) gave the extraordinarily
precise value


$$ m_t = 162 \pm 9\ \text{GeV} \tag{22.199} $$


without any assumption for $m_H$.

A month or so earlier, the CDF collaboration (Abe et al. 1994a,b) announced
12 events consistent with the hypothesis of production of a $t\bar t$ pair,
and on this hypothesis the mass was found to be


$$ m_t = 174 \pm 10\,^{+13}_{-12}\ \text{GeV}, \tag{22.200} $$


5 The relatively small effect of large variations in $m_H$ illustrates the lack of sensitivity to
virtual Higgs effects, noted in the preceding section.




and this was followed by nine similar events from D0 (Abachi et al. 1995a).
By February 1995 both groups had amassed more data and the discovery was
announced (Abe et al. 1995b, Abachi et al. 1995b). The 2010 experimental
value for $m_t$ is $173.1 \pm 1.3$ GeV (Nakamura et al. 2010) as compared to the value
predicted by fits to the electroweak data of $173.2 \pm 1.3$ GeV. This represents an
extraordinary triumph for both theory and experiment. It is surely remarkable
how the quantum fluctuations of a yet-to-be-detected new particle could pin
down its mass so precisely. It seems hard to deny that Nature has indeed made
use of the subtle intricacies of a spontaneously broken non-Abelian gauge
theory.

One feature of the ‘real’ top events is particularly noteworthy. Unlike the
mass of the other quarks, $m_t$ is greater than $M_W$, and this means that it can
decay to b + W via real W emission:


$$ t \to W^+ + b. \tag{22.201} $$


In contrast, the b quark itself decays by the usual virtual W processes. Now we
have seen that the virtual process is suppressed by $\sim 1/M_W^2$ if the energy release
(as in the case of b-decay) is well below $M_W$. But the real process (22.201)
suffers no such suppression and proceeds very much faster. In fact (problem
22.10) the top quark lifetime from (22.201) is estimated to be $\sim 4 \times 10^{-25}$
s! This is quite similar to the lifetime of the $W^+$ itself, via $W^+ \to e^+\nu_e$, for
example. Consider now the production of a $t\bar t$ pair in the collision between
two partons. As the t and $\bar t$ separate, the strong interactions which should
eventually ‘hadronize’ them will not play a role until they are $\sim 1$ fm apart.
But if they are travelling close to the speed of light, they can only travel some
$10^{-16}$ m before decaying. Thus t’s tend to decay before they experience the
confining QCD interactions, a point we also made in section 1.2. Instead, the
hadronization is associated with the b quark, which has a more typical weak
lifetime ($\sim 1.5 \times 10^{-12}$ s). By the same token, this fast decay of the t quark
means that there will be no detectable $t\bar t$ ‘toponium’, bound by QCD.
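
The orders of magnitude quoted here can be reproduced with a short numerical sketch. It assumes the standard lowest-order width for $t \to W^+ b$ with the b-quark mass neglected and $|V_{tb}| = 1$, namely $\Gamma = \frac{G_F m_t^3}{8\pi\sqrt{2}}(1 - M_W^2/m_t^2)^2(1 + 2M_W^2/m_t^2)$; this formula and the numerical inputs are assumptions brought in for illustration and are not quoted in the text above.

```python
import math

# Assumed illustrative inputs
G_F = 1.16638e-5        # Fermi constant in GeV^-2
HBAR_GEV_S = 6.582e-25  # hbar in GeV s
C_M_PER_S = 2.998e8     # speed of light in m/s


def top_width(m_t=173.0, m_w=80.4, g_f=G_F):
    """Lowest-order width for t -> W b in GeV, neglecting m_b and taking |V_tb| = 1."""
    x = (m_w / m_t) ** 2
    prefactor = g_f * m_t**3 / (8.0 * math.pi * math.sqrt(2.0))
    return prefactor * (1.0 - x) ** 2 * (1.0 + 2.0 * x)


def top_lifetime(m_t=173.0, m_w=80.4):
    """Lifetime hbar / Gamma in seconds."""
    return HBAR_GEV_S / top_width(m_t, m_w)


def flight_distance(m_t=173.0, m_w=80.4):
    """Distance c * tau in metres, ignoring any Lorentz time-dilation factor."""
    return C_M_PER_S * top_lifetime(m_t, m_w)


if __name__ == "__main__":
    print("width    (GeV):", top_width())
    print("lifetime (s)  :", top_lifetime())
    print("c * tau  (m)  :", flight_distance())
```

With these inputs the width is about 1.5 GeV, the lifetime a few times $10^{-25}$ s, and $c\tau$ of order $10^{-16}$ m, well below the 1 fm scale at which hadronization sets in.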

With the t quark safely real, the Higgs boson was the one remaining miss- 
ing particle in the Standard Model complement, and its discovery was of the 
utmost importance. We end this book with a brief review of Higgs physics 
and the experiments leading to the probable discovery of this long-awaited 
particle in 2012. 




22.8 The Higgs sector 


It is worth noting that an essential feature of the type of theory which has 
been described in this note is the prediction of incomplete multiplets of 
scalar and vector bosons. 


—P W Higgs (1964) 


22.8.1 Introduction 


The Lagrangian for an unbroken $SU(2)_L \times U(1)$ gauge theory of vector bosons
and fermions is rather simple and elegant, all the interactions being deter- 
mined by just two Lagrangian parameters g and g’ in a ‘universal’ way. All 
the particles in this hypothetical world are, however, massless. In the real 
world, while the electroweak interactions are undoubtedly well described by 
the $SU(2)_L \times U(1)$ theory, neither the mediating gauge quanta (apart from the
photon) nor the fermions are massless. They must acquire mass in some way 
that does not break the gauge symmetry of the Lagrangian, or else the renor- 
malizability of the theory is destroyed, and its remarkable empirical success (at 
a level which includes loop corrections) would be some kind of freak accident. 
In chapter 19 we discussed how such a breaking of a gauge symmetry does 
happen, dynamically, in a superconductor. In that case ‘electron pairing’ was 
a crucial ingredient. In particle physics, while a lot of effort has gone into ex- 
amining various analogous ‘dynamical symmetry breaking’ theories, none has 
yet emerged as both theoretically compelling and phenomenologically viable. 
However, a simple count of the number of degrees of freedom in a massive vec- 
tor field, as opposed to a massless one, indicates that some additional fields 
must be present in order to give mass to the originally massless gauge bosons. 
And so, in the Standard Model, it is simply assumed, following the original 
ideas of Higgs and others (Higgs 1964, Englert and Brout 1964, Guralnik et
al. 1964; Higgs 1966) that a suitable scalar (‘Higgs’) field exists, with a po- 
tential which causes the ground state (the vacuum) to break the symmetry 
spontaneously. Furthermore, rather than (as in BCS theory) obtaining the 
fermion mass gaps dynamically, they too are put in ‘by hand’ via Yukawa-like 
couplings to the Higgs field. 


It has to be admitted that this part of the Standard Model appears to be 
the least satisfactory. Consider the Higgs couplings, which are listed in ap- 
pendix Q, section Q.2.3. While the couplings of the Higgs field to the gauge 
fields are all determined by the gauge symmetry, the Higgs self-couplings 
(trilinear and quadrilinear) are not gauge interactions and are unrelated to 
anything else in the theory. Likewise, the Yukawa-like fermion couplings are 
not gauge interactions either, and they are both unconstrained and uncom- 
fortably different in orders of magnitude. True, all these are renormalizable 
couplings — but this basically means that their values are not calculable and 
have all to be taken from experiment. 


Such considerations may indicate that the ‘Higgs Sector’ of the Standard
Model is on a somewhat different footing from the rest of it (a commonly held
view, indeed). Perhaps it should be regarded as more a ‘phenomenology’ than
a ‘theory’, much as the current-current model was. In this connection, we may
mention a point which has long worried many theorists. In section 22.6 we
noted that figure 22.26 gives a quadratically divergent ($O(\Lambda^2)$) and positive
contribution to the $\hat\phi^\dagger\hat\phi$ term in the Lagrangian, at one loop order. This term
would ordinarily, of course, be just the mass term of the scalar field. But in
the Higgs case, the matter is much more delicate. The whole phenomenology
depends on the renormalized coefficient having a negative value, triggering
the spontaneous breaking of the symmetry. This means that the $O(\Lambda^2)$ one-loop
correction must be cancelled by the ‘bare’ mass term $\frac{1}{2}m_{H,0}^2\hat\phi^\dagger\hat\phi$ so as to
achieve a negative coefficient of order $-v^2$. This cancellation between $m_{H,0}^2$
and $\Lambda^2$ will have to be very precise indeed if $\Lambda$, the scale of ‘new physics’,
is very high, as is commonly assumed (say $10^{16}$ GeV).
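
How precise is ‘very precise’? A rough order-of-magnitude sketch is to compare the required coefficient, of order $v^2$, with the size of the correction, of order $\Lambda^2$. All numerical coefficients are set to one here, which is an assumption made only for illustration, as is the input value $v \approx 246$ GeV; the function names are hypothetical.

```python
import math

V_EW = 246.0  # electroweak scale v in GeV (assumed input)


def tuning_ratio(cutoff_gev, v=V_EW):
    """Ratio v^2 / Lambda^2: the fraction of the O(Lambda^2) term left after cancellation."""
    return (v / cutoff_gev) ** 2


def digits_of_cancellation(cutoff_gev, v=V_EW):
    """Number of decimal places to which the bare and loop terms must cancel."""
    return -math.log10(tuning_ratio(cutoff_gev, v))


if __name__ == "__main__":
    for cutoff in (3.0e3, 1.0e16):
        print(cutoff, tuning_ratio(cutoff), digits_of_cancellation(cutoff))
```

For a cut-off of $10^{16}$ GeV the two terms must cancel to roughly 27 decimal places, whereas for a cut-off of a few TeV only two or three places are needed.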

The reader may wonder why attention should now be drawn to this par- 
ticular piece of renormalization: aren’t all divergences handled this way? In a 
sense they are, but the fact is that this is the first case we have had in which 
we have to cancel a quadratic divergence. The other mass-corrections have all 
been logarithmic, for which there is nothing like such a dramatic ‘fine-tuning’ 
problem. There is a good reason for this in the case of the electron mass, 
which we remarked on in section 11.2. Chiral symmetry forces self-energy
corrections for fermions to be proportional to their mass, and hence to contain
only logarithms of the cut-off. Similarly, gauge invariance for the vector
bosons prohibits any $O(\Lambda^2)$ corrections in perturbation theory. But there
is no symmetry, within the Standard Model, which ‘protects’ the coefficient
of $\hat\phi^\dagger\hat\phi$ in this way. It is hard to understand what can be stopping it from
being of order $\Lambda^2$, if we take the apparently reasonable point of view that the
Standard Model will ultimately fail at some scale $\Lambda$ where new physics enters.
Thus the difficulty is: why is the empirical parameter v ‘shielded’ from the 
presumed high scale of new physics? This ‘problem’ is often referred to as the 
‘hierarchy problem’, or the ‘fine-tuning problem’. We stress again that we are 
dealing here with an absolutely crucial symmetry-breaking term, which one 
would really like to understand far better. 

Of course, the problem would go away if the scale $\Lambda$ were as low as, say
a few TeV. As we shall see in the next section this happens to be, not ac- 
cidentally, the same scale at which the Standard Model ceases to be a per- 
turbatively calculable theory. Various possibilities have been suggested for 
the kind of physics that might enter at energies of a few TeV. For example, 
‘technicolour’ models (Peskin 1997) regard the Higgs field as a composite of 
some new heavy fermions, rather like the BCS-pairing idea referred to ear- 
lier. A second possibility is supersymmetry (Aitchison 2007), in which there 
is a ‘protective’ symmetry operating, since scalar fields can be put alongside 
fermions in supermultiplets, and benefit from the protection enjoyed by the 
fermions. A third possibility is that of ‘large’ extra dimensions (Antoniadis 
2002). 

These undoubtedly fascinating ideas obviously take us well beyond our 
proper subject to which we must now return. Whatever may lie ‘beyond’ 
it, the Lagrangian of the Higgs sector of the Standard Model leads to many 
perfectly definite predictions which may be confronted with experiment, as we 
shall briefly discuss in section 22.8.3 (for a full account see Dawson et al. 1990, 
and for more compact ones see Ellis, Stirling and Webber 1996, chapter 11, 
and the review by Bernardi, Carena and Junk in Beringer et al. 2012). The 
elucidation of the mechanism of gauge symmetry breaking is undoubtedly of
the greatest importance to particle physics: quite apart from the $SU(2)_L \times U(1)$
theory, very many of the proposed theories which go ‘beyond the Standard 
Model’ face a similar ‘mass problem’, and generally appeal to some variant of 
the ‘Higgs mechanism’ to deal with it. 

As Higgs noted in the final paragraph of his 2-page Letter (Higgs 1964), an 
essential feature of the spontaneous symmetry breaking mechanism, in a gauge 
theory, is the appearance of incomplete multiplets of both scalar and vector 
bosons. Let us just rehearse this once more, in the SU(2) x U(1) case. We 
started with 4 massless gauge fields, belonging to an SU(2) triplet and a U(1) 
singlet; and, in addition, 4 scalar fields of equal mass, in an SU(2) doublet. 
After symmetry breaking, three massive vector bosons emerged, leaving the 
photon massless. In the scalar sector, three of the scalars became the longi- 
tudinal components of the three massive vector bosons, and one lone massive 
scalar field survived, all that remained of the original scalar doublet. Its mass 
is a free parameter of the theory, being given by $m_H = \sqrt{2}\mu = \sqrt{\lambda}\,v/\sqrt{2}$. The
discovery (or otherwise) of this Higgs boson has therefore been a vital goal in
particle physics for over forty years. Before turning to experiment, however,
we want to mention some theoretical considerations concerning $m_H$ by way of
orientation.


22.8.2 Theoretical considerations concerning $m_H$


The coupling constant $\lambda$, which determines $m_H$ given the known value of $v$,
is unfortunately undetermined in the Standard Model. However, some quite
strong theoretical arguments suggest that $m_H$ cannot be arbitrarily large.
Like all coupling constants in a renormalizable theory, $\lambda$ must ‘run’. For
the $(\hat\phi^\dagger\hat\phi)^2$ interaction of (22.30), a one-loop calculation of the $\beta$-function leads
to

$$ \lambda(E) = \frac{\lambda(v)}{1 - \dfrac{3\lambda(v)}{8\pi^2}\ln(E/v)}. \tag{22.202} $$

Like QED, this theory is not asymptotically free: the coupling increases with
the scale $E$. In fact, the theory becomes non-perturbative at the scale $E^*$ such
that

$$ E^* \sim v\exp\left(\frac{8\pi^2}{3\lambda(v)}\right). \tag{22.203} $$

Note that this is exponentially sensitive to the ‘low-energy’ coupling constant
$\lambda(v)$, and that $E^*$ decreases rapidly as $\lambda(v)$ increases. But (see (22.40)) $m_H$ is
essentially proportional to $\lambda^{1/2}(v)$. Hence as $m_H$ increases, non-perturbative
behaviour sets in increasingly early. Suppose we say that we should like perturbative
behaviour to be maintained up to an energy scale $\Lambda$. Then we
require

$$ m_H < v\left[\frac{4\pi^2}{3\ln(\Lambda/v)}\right]^{1/2}. \tag{22.204} $$

For $\Lambda \sim 10^{16}$ GeV, this gives $m_H < 160$ GeV. On the other hand, if the
non-perturbative regime sets in at 1 TeV, then the bound on $m_H$ is weaker,
$m_H < 750$ GeV.
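
The two numerical bounds just quoted can be reproduced with a few lines of arithmetic. The sketch below assumes the one-loop forms $\lambda(E) = \lambda(v)/[1 - (3\lambda(v)/8\pi^2)\ln(E/v)]$, $E^* \sim v\exp(8\pi^2/3\lambda(v))$ and $m_H < v[4\pi^2/(3\ln(\Lambda/v))]^{1/2}$, together with $m_H^2 = \lambda v^2/2$ and the input value $v \approx 246$ GeV; the function names are hypothetical.

```python
import math

V_EW = 246.0  # electroweak scale v in GeV (assumed input)


def running_lambda(lam_v, energy, v=V_EW):
    """One-loop running quartic coupling, valid only below the pole of the denominator."""
    return lam_v / (1.0 - 3.0 * lam_v / (8.0 * math.pi**2) * math.log(energy / v))


def nonperturbative_scale(lam_v, v=V_EW):
    """Scale E* at which the one-loop denominator vanishes."""
    return v * math.exp(8.0 * math.pi**2 / (3.0 * lam_v))


def higgs_mass_bound(cutoff, v=V_EW):
    """Upper bound on m_H (GeV) if perturbation theory is to hold up to the given cut-off."""
    return v * math.sqrt(4.0 * math.pi**2 / (3.0 * math.log(cutoff / v)))


if __name__ == "__main__":
    for cutoff in (1.0e16, 1.0e3):
        m_max = higgs_mass_bound(cutoff)
        lam = 2.0 * m_max**2 / V_EW**2  # from m_H^2 = lambda v^2 / 2
        print(cutoff, m_max, nonperturbative_scale(lam))
```

By construction, a Higgs mass sitting exactly at the bound corresponds to a coupling whose non-perturbative scale $E^*$ coincides with the chosen cut-off.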

This is an oversimplified argument for various reasons, though the essential
point is correct. An important omission is the contribution of the top quark to
the running of $\lambda(E)$. A more refined version (Hambye and Riesselmann 1997)
concludes that for $m_H < 180$ GeV the perturbative regime could extend all
the way to the Planck mass, $\sim 10^{19}$ GeV.

There is another, independent, argument which suggests that $m_H$ cannot
be too large. We have previously considered violations of unitarity by the
lowest-order diagrams for certain processes (see chapter 21 and section 22.6).
As we saw, in a non-gauge theory with massive vector bosons, such violations
are associated with the longitudinal polarization states of the bosons, which
carry factors proportional to the 4-momentum $k^\mu$ (see (22.18)). In a gauge
theory, strong cancellations in the high energy behaviour occur between different
lowest-order diagrams. This behaviour is characteristic of gauge theories
(Llewellyn Smith 1973, Cornwall et al. 1974), and is related to their renormalizability.
One process of this sort which we did not yet consider, however,
is that in which two longitudinally polarized W’s scatter from each other. A
considerable number of diagrams (7 in all) contribute to this process, in leading
order: exchange of $\gamma$, Z and Higgs particles, together with the W-W self
interaction. When all these are added up the high-energy behaviour of the
total amplitude turns out to be proportional to $\lambda$, the Higgs coupling constant
(see for example Ellis, Stirling and Webber 1996, chapter 8). This at first sight
unexpected result can be understood as follows. The longitudinal components
of the W’s arise from the ‘$\partial_\mu\hat\phi$’ parts in (22.30) (compare equation (19.48) in
the U(1) case), which produce $k^\mu$ factors. Thus the scattering of longitudinal
W’s is effectively the scattering of the 3 Goldstone bosons in the complex
Higgs doublet. These bosons have self interactions arising from the $\frac{\lambda}{4}(\hat\phi^\dagger\hat\phi)^2$
Higgs potential, for which the Feynman amplitude is just proportional to $\lambda$.
Now, although such a constant term obviously cannot violate unitarity as the
energy increases (as happened in the other cases), it can do so if $\lambda$ itself is too
big; and since $\lambda \propto m_H^2$, this puts a bound on $m_H$. A constant amplitude is
pure $J = 0$ and so, in order of magnitude, we expect unitarity to imply $\lambda < 1$.
In terms of standard quantities,


$$ \lambda = m_H^2\,4G_F/\sqrt{2}, \tag{22.205} $$


and so we expect

$$ m_H < G_F^{-1/2} \sim 300\ \text{GeV}, \tag{22.206} $$

an energy scale we have seen several times before. A more refined analysis
(Lee et al. 1977a,b) gives

$$ m_H < \left[\frac{8\sqrt{2}\,\pi}{3G_F}\right]^{1/2} \approx 1\ \text{TeV}. \tag{22.207} $$
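
Both unitarity estimates are simple functions of $G_F$, and can be evaluated directly. The sketch below assumes the order-of-magnitude form $m_H < G_F^{-1/2}$ and the refined form $m_H < [8\sqrt{2}\pi/(3G_F)]^{1/2}$; the numerical value of $G_F$ is an assumed input and the function names are hypothetical.

```python
import math

G_F = 1.16638e-5  # Fermi constant in GeV^-2 (assumed input)


def naive_unitarity_bound(g_f=G_F):
    """Order-of-magnitude bound G_F^(-1/2) in GeV."""
    return g_f ** -0.5


def refined_unitarity_bound(g_f=G_F):
    """Refined bound sqrt(8 sqrt(2) pi / (3 G_F)) in GeV."""
    return math.sqrt(8.0 * math.sqrt(2.0) * math.pi / (3.0 * g_f))


def quartic_coupling(m_h, g_f=G_F):
    """lambda = 4 G_F m_H^2 / sqrt(2) for a given Higgs mass in GeV."""
    return 4.0 * g_f * m_h**2 / math.sqrt(2.0)


if __name__ == "__main__":
    print("naive bound   (GeV):", naive_unitarity_bound())
    print("refined bound (GeV):", refined_unitarity_bound())
    print("lambda at 125 GeV  :", quartic_coupling(125.0))
```

The naive estimate comes out just under 300 GeV and the refined one very close to 1 TeV; a Higgs mass of 125 GeV corresponds to a coupling comfortably below one.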
Like the preceding argument, this one does not say that $m_H$ must be less
than some fixed number. Rather, it says that if $m_H$ gets bigger than a certain
value, perturbation theory will fail, or ‘new physics’ will enter. It is, in fact,
curiously reminiscent of the original situation with the four-fermion current-current
interaction itself (compare (22.10) with (22.206)). Perhaps this is a
clue that we may eventually need to replace the Higgs phenomenology. At all
events, this line of reasoning seems to imply that the Higgs boson will either
be found at a mass well below 1 TeV, or else some electroweak interactions
will become effectively strong with new physical consequences. This ‘no lose’
situation provided powerful motivation for the construction of the LHC.

There is also an interesting lower bound on the Higgs mass, which is derived
from the requirement of vacuum stability. If the Higgs mass is sufficiently
lighter than the top quark mass, the top quark loop contribution to
the running of the quartic coupling $\lambda(E)$ can cause the coupling to go negative
at large energy scales (Cabibbo et al. 1979). This would imply that, at
such scales, the effective scalar potential of the Standard Model would be unbounded
below at large absolute values of the field, and there would no longer
be a stable ground state (vacuum). This can be tolerated if the lifetime of the
metastable vacuum is longer than the age of the Universe (see Isidori et al. 2001,
and references cited therein). A re-examination of the issue by Elias-Miro et
al. (2012) showed that the Standard Model vacuum would become unstable
at scales around the Planck mass, for $m_H < 130$ GeV. For $m_H \sim 125$ GeV,
instability occurs at scales of order $10^{10}$ GeV, but the lifetime is greater than
the age of the Universe. Of course, new physics may enter well before such a
scale. It is nevertheless intriguing that a Higgs mass in this region may have
implications for the physics of the early Universe.

We now consider some simple aspects of Higgs boson production and de- 
cay processes at collider energies, as predicted by the Standard Model, and 
conclude with the experiments leading to the probable Higgs boson discovery 
in 2012. 


22.8.3 Higgs boson searches and the 2012 discovery 


We begin by considering the main production and decay modes. The existing
lower bound on $m_H$ established at LEP (LEP 2003)


$$ m_H > 114.4\ \text{GeV (95\% Confidence Level)} \tag{22.208} $$


already excluded many possibilities in both production and decay. Subsequent
searches were carried out at the hadron colliders. At both the Tevatron and
the LHC, the dominant parton-level production mechanism is ‘gluon fusion’
via an intermediate top quark loop as shown in figure 22.27 (Georgi et al.
1978, Glashow et al. 1978, Stange et al. 1994a,b). The intermediate t quark
dominates, because the Higgs couplings to fermions are proportional to the
fermion mass. Since the gluon probability distribution rises rapidly at small
$x$ values, which are probed at larger collider energy $\sqrt{s}$, the cross section for
FIGURE 22.27
Higgs boson production process by ‘gluon fusion’. 


FIGURE 22.28 
Higgs boson production process by ‘vector boson fusion’. 


this process (which is the same for pp and $p\bar p$ colliders) will rise with energy.
At the Tevatron with $\sqrt{s} = 1.96$ TeV, the cross section ranges from about
1 pb for $m_H \sim 100$ GeV to 0.2 pb for $m_H \sim 200$ GeV. At an LHC energy
of $\sqrt{s} = 7$ TeV, the cross section is about 25 pb for $m_H \sim 100$ GeV and 0.1
pb for $m_H \sim 700$ GeV, rising to about 70 pb and 1 pb respectively at $\sqrt{s} =
14$ TeV (Dittmaier et al. 2011). These numbers include QCD corrections,
which increase the parton-level cross sections by a factor of about 2.

The next largest cross sections, some ten times smaller than the gluon 
fusion process, are for Higgs production via ‘vector boson fusion’ ($qq' \to qq'H$, 
see figure 22.28) and for associated production of a Higgs boson with a vector 
boson ($q\bar{q} \to WH, ZH$, see figure 22.29). 

These processes involve the trilinear Higgs couplings to the vector bosons, 
which are proportional to their masses (see appendix Q). At the LHC, the first 
of these cross sections is somewhat larger than the second for $m_H < 130$ GeV, 
while the order is reversed at the Tevatron because the initial state is $p\bar{p}$. 
A fourth production possibility, at a significantly smaller rate, is ‘associated 
production with top quarks’ as shown in figure 22.30, for example. Figure 
22.31 (taken from Ellis, Stirling and Webber 1996) shows the cross sections 
for the various production processes as a function of $m_H$, for $pp$ collisions 
at $\sqrt{s} = 14$ TeV. Updated calculations (including QCD and electroweak 
corrections) are described in reports by Dittmaier et al. (2011, 2012), which 
present the results of a very large-scale theoretical effort. 

FIGURE 22.29 
Higgs boson production in association with W or Z. 

FIGURE 22.30 
Higgs boson production in association with a $t\bar{t}$ pair. 

The Higgs boson must be detected via its decays. For $m_H < 135$ GeV, 
decays to fermion-antifermion pairs dominate, of which $b\bar{b}$ has the largest 
branching ratio because of the larger value of $m_b$; the decay to $\tau^+\tau^-$ is roughly 
an order of magnitude smaller. The width of $H \to f\bar{f}$ is easily calculated to 
lowest order and is (problem 22.11) 

$$ \Gamma(H \to f\bar{f}) = \frac{C G_{\rm F} m_f^2 m_H}{4\pi\sqrt{2}} \left(1 - \frac{4m_f^2}{m_H^2}\right)^{3/2} \tag{22.209} $$


where the colour factor $C$ is 3 for quarks and 1 for leptons. For such $m_H$ 
values, $\Gamma(H \to f\bar{f})$ is less than 5 MeV, and the total decay width is less than 
10 MeV. QCD corrections are largely accounted for by replacing $m_f^2$ in the first 
factor on the right-hand side of (22.209), which arises from the Higgs-fermion 
Yukawa coupling, by the $\overline{\rm MS}$ running mass value $\bar{m}_f^2(m_H)$. 

FIGURE 22.31 

Higgs boson production cross sections in pp collisions at the LHC (figure 
from R K Ellis, W J Stirling and B R Webber QCD and Collider Physics 
1996, courtesy Cambridge University Press). 
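As a rough numerical illustration of (22.209), the following sketch evaluates the lowest-order width for $m_H = 125$ GeV. The function name `width_ffbar` and the numerical inputs (the Fermi constant, a $b$-quark mass of 4.18 GeV and a $\tau$ mass of 1.777 GeV) are assumptions made for this example and are not quoted in the text; using a running $b$ mass evaluated at $m_H$ would lower the $b\bar{b}$ width further.

```python
import math

G_F = 1.1664e-5  # Fermi constant in GeV^-2 (assumed input value)

def width_ffbar(m_H, m_f, colour):
    """Lowest-order width for H -> f fbar in GeV, following (22.209)."""
    if m_H <= 2.0 * m_f:
        return 0.0  # the decay is kinematically closed
    beta_sq = 1.0 - 4.0 * m_f**2 / m_H**2
    prefactor = colour * G_F * m_f**2 * m_H / (4.0 * math.pi * math.sqrt(2.0))
    return prefactor * beta_sq**1.5

m_H = 125.0  # GeV
gamma_bb = width_ffbar(m_H, m_f=4.18, colour=3)       # assumed b-quark mass
gamma_tautau = width_ffbar(m_H, m_f=1.777, colour=1)  # assumed tau mass

print(f"Gamma(H -> b bbar)   = {1e3 * gamma_bb:.2f} MeV")
print(f"Gamma(H -> tau tau)  = {1e3 * gamma_tautau:.3f} MeV")
print(f"ratio bb / tautau    = {gamma_bb / gamma_tautau:.1f}")
```

With these inputs the $b\bar{b}$ width comes out below the 5 MeV bound quoted above, and the $\tau^+\tau^-$ width is smaller by more than a factor of ten, mainly through the factor $C m_f^2$.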

However, the large rate for the process $gg \to H \to b\bar{b}$ has to compete 
against a very large background from the inclusive production of $pp$ (or $p\bar{p}$) $\to$ 
$b\bar{b} + X$ via the strong interaction. The Higgs signal can be separated from such 
a background by using subleading decay modes such as $H \to \gamma\gamma$. The Higgs 
boson’s coupling to photons is induced by quark triangle loops (figure 22.32) 
or a W loop. In a similar way, the associated production modes $W^{\pm}H$, $ZH$ 
allow use of the leptonic W and Z decays to reject QCD backgrounds. 

Decays to a pair of vector bosons are also important. The tree-level width 
for $H \to W^+W^-$ is (problem 22.11) 

$$ \Gamma(H \to W^+W^-) = \frac{G_{\rm F} m_H^3}{8\pi\sqrt{2}} \left(1 - \frac{4M_W^2}{m_H^2}\right)^{1/2} \left(1 - \frac{4M_W^2}{m_H^2} + \frac{12M_W^4}{m_H^4}\right) \tag{22.210} $$

and the width for $H \to ZZ$ is the same with $M_W \to M_Z$ and a factor of $\frac{1}{2}$ to 
allow for the two identical bosons in the final state. These widths rise rapidly 
with $m_H$, reaching $\Gamma \sim 1$ GeV when $m_H \sim 200$ GeV. Even for values of 
$m_H$ below the physical $W^+W^-$ and $ZZ$ thresholds, H can still decay through 
modes mediated by virtual bosons, via the off-shell decays $H \to WW^*$ and 
$H \to ZZ^*$. 
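The statement that the vector boson widths reach about 1 GeV near $m_H \sim 200$ GeV can be checked with a short sketch of (22.210). The function name `width_VV` and the input values of $G_F$, $M_W$ and $M_Z$ are assumptions chosen for illustration; the formula is the tree-level on-shell expression only, so it returns zero below threshold and does not describe the off-shell modes.

```python
import math

G_F = 1.1664e-5  # Fermi constant in GeV^-2 (assumed input value)
M_W = 80.4       # W mass in GeV (assumed input value)
M_Z = 91.19      # Z mass in GeV (assumed input value)

def width_VV(m_H, M_V, identical):
    """Tree-level width for H -> VV in GeV, following (22.210).

    `identical` is True for ZZ, where a factor 1/2 accounts for the two
    identical bosons in the final state.
    """
    x = 4.0 * M_V**2 / m_H**2
    if x >= 1.0:
        return 0.0  # below the on-shell threshold
    # Note that 12 M_V^4 / m_H^4 = (3/4) x^2.
    gamma = (G_F * m_H**3 / (8.0 * math.pi * math.sqrt(2.0))
             * math.sqrt(1.0 - x) * (1.0 - x + 0.75 * x**2))
    return 0.5 * gamma if identical else gamma

gamma_WW = width_VV(200.0, M_W, identical=False)
gamma_ZZ = width_VV(200.0, M_Z, identical=True)
print(f"Gamma(H -> WW) at 200 GeV = {gamma_WW:.2f} GeV")
print(f"Gamma(H -> ZZ) at 200 GeV = {gamma_ZZ:.2f} GeV")
```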


FIGURE 22.32 
Higgs boson decay via quark triangle. 


Figure 22.33, taken from Ellis, Stirling and Webber (1996), shows the 
complete set of phenomenologically relevant Higgs branching ratios for a Higgs 
boson with $m_H < 200$ GeV. Updated results for SM Higgs branching ratios 
are reported in Dittmaier et al. (2012). 

We turn now to the experiments. The Tevatron $p\bar{p}$ collider at Fermilab 
operated at $\sqrt{s} = 1.96$ TeV until its shutdown in 2011. Higgs searches were 
conducted by two experiments, CDF and D0, which each collected approximately 
10 fb$^{-1}$ of data with the capability of seeing a Higgs signal in the 
mass range 90-185 GeV. The analyses searched for a Higgs boson produced 
through gluon fusion, in association with a vector boson, and through vector 
boson fusion. The decays $H \to b\bar{b}$, $H \to W^+W^-$, $H \to ZZ$, $H \to \tau^+\tau^-$ and 
$H \to \gamma\gamma$ were all studied. 

The LHC is a pp collider at CERN which started running in 2010. The two 
general purpose detectors ATLAS (‘A Toroidal LHC ApparatuS’) and CMS 
(‘Compact Muon Solenoid’) were designed to study physics at the TeV scale, 
and in particular to search for the Higgs boson. In 2011, the LHC delivered 
to ATLAS and CMS up to 5.1 fb$^{-1}$ of integrated luminosity of $pp$ collisions 
at $\sqrt{s} = 7$ TeV. In 2012 the CM energy was increased to 8 TeV, and by July 
2012 up to 5.9 fb$^{-1}$ of further data was delivered. At the LHC, the main Higgs 
boson production processes are the same as at the Tevatron, but as mentioned 
above vector boson fusion is more important than production in association 
with W or Z, or with $t\bar{t}$. The LHC experiments are sensitive to Higgs bosons 
of much higher mass than the Tevatron experiments, ranging from the LEP 
bound (22.208) up to about 600 GeV. The same decay channels were studied 
as at the Tevatron. 

FIGURE 22.33 

Branching ratios of the Higgs boson (figure from R K Ellis, W J Stirling and 
B R Webber QCD and Collider Physics 1996, courtesy Cambridge University 
Press). 

By early 2012, the ATLAS and CMS experiments had excluded an $m_H$ 
value in the interval 129 GeV to 539 GeV at the 95% CL, and the mass 
region 120-130 GeV was under intensive study, excesses of events having been 
reported by both experiments in the region 124-126 GeV (Aad et al. 2012a, 
Chatrchyan et al. 2012a). Then, on July 4, 2012, the ATLAS and CMS 
collaborations simultaneously announced the observation (at a significance 
greater than $5\sigma$) of a new boson with a mass in the range 125-126 GeV and 
with properties compatible with those of a SM Higgs boson. These results 
(updated) are reported in Aad et al. (2012b) and Chatrchyan et al. (2012b). 
The crucial channels in the discovery were the decay modes $H \to \gamma\gamma$ and 
$H \to ZZ^* \to 4$ leptons, both of which provide a high-resolution invariant 
mass for fully reconstructed candidates. The cover illustration for volume 1 
of this book (copyright CERN) shows a candidate $\gamma\gamma$ event recorded by CMS, 
and that for volume 2 (copyright CERN) shows a candidate four-muon event 
recorded by ATLAS. The channel $H \to WW^* \to \ell\nu\ell\nu$ is equally sensitive but 
has low resolution. The ATLAS result for the mass of the boson was (Aad et 
al. 2012b) 


$$ 126.0 \pm 0.4\,(\text{stat}) \pm 0.4\,(\text{syst})\ \text{GeV} \tag{22.211} $$

and the CMS result was (Chatrchyan et al. 2012b) 

$$ 125.3 \pm 0.4\,(\text{stat}) \pm 0.5\,(\text{syst})\ \text{GeV}. \tag{22.212} $$

At about the same time, the CDF and D0 collaborations at the Tevatron 
reported the combined results of their searches for a SM Higgs boson produced 
in association with a W or a Z boson, and subsequently decaying to a $b\bar{b}$ pair. 
The data corresponded to an integrated luminosity of 9.7 fb$^{-1}$. An excess of 
events was observed in the mass range 120-135 GeV, at a significance of $3\sigma$, 
which was interpreted as evidence for a new particle, consistent with the SM 
Higgs boson (Aaltonen et al. 2012). This provided the first strong indication 
for the decay of the new particle to a fermion-antifermion pair at a rate 
consistent with the SM prediction. 


Is the particle discovered by the ATLAS and CMS collaborations the Higgs 
boson of the Standard Model? The decay to two photons implies that its 
spin cannot be unity (Landau 1948, Yang 1950), but spin-0 has not yet been 
established. Even so, this already implies that the particle is different from all 
the other SM particles. The decay modes $\gamma\gamma$, $ZZ^*$, $WW^*$ have been observed 
by ATLAS and CMS, and $b\bar{b}$ by CDF/D0. The $\tau^+\tau^-$ mode has not yet been 
seen. A measure of the compatibility of the observed boson with the SM 
Higgs boson is provided by the best-fit value of the common signal strength 
parameter $\mu$ defined by 

$$ \mu = \sigma\cdot{\rm BR}/(\sigma\cdot{\rm BR})_{\rm SM} \tag{22.213} $$

where $\sigma$ is the boson production cross section and BR is the branching ratio 
of the boson to the observed final state. ‘SM’ denotes the SM prediction, so 
that the value $\mu = 1$ is the SM hypothesis. ATLAS reported a best-fit $\mu$-value 
of $\mu = 1.4 \pm 0.3$ for $m_H = 126$ GeV; the $\mu$-values for the individual channels 
were all within one standard deviation (s.d.) of unity. CMS reported a best-fit 
value of $\mu = 0.87 \pm 0.23$ at $m_H = 125.5$ GeV, and again the individual values 
in the observed channels were within 1 s.d. of unity. The conclusion is that 
these results are consistent, within uncertainties, with the predictions for the 
SM Higgs boson. 
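The comparison with the SM hypothesis $\mu = 1$ can be made explicit with a few lines of arithmetic. The sketch below takes the two combined best-fit values quoted above and expresses their distance from unity in units of the quoted uncertainty; treating that uncertainty as a Gaussian standard deviation, and the helper name `pull_from_sm`, are assumptions of this example.

```python
# Combined best-fit signal strengths (value, uncertainty) as quoted in the text.
signal_strengths = {"ATLAS": (1.4, 0.3), "CMS": (0.87, 0.23)}

def pull_from_sm(mu, sigma, mu_sm=1.0):
    """Distance of a measured signal strength from the SM value, in units of sigma."""
    return (mu - mu_sm) / sigma

pulls = {name: pull_from_sm(mu, sigma)
         for name, (mu, sigma) in signal_strengths.items()}

for name, pull in pulls.items():
    print(f"{name}: mu differs from 1 by {pull:+.2f} s.d.")
```

Both combined values lie within about one and a half standard deviations of the SM expectation, one above and one below.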


We end this book with a discovery which opens a new era in particle 
physics, in which the electroweak symmetry-breaking (Higgs) sector will be 
rigorously tested. The aim will be to measure the couplings of the new boson 
to the other SM particles with increasing accuracy, so as to reveal possible 
deviations from the SM values. The level of precision required to provide 
clear pointers to physics beyond the SM may be very high (see for example 
Gupta et al. 2012). The LHC will continue running until early 2013, when 
it will be shut down for machine improvements needed to allow operation at 
$\sqrt{s} = 14$ TeV and higher luminosity; beyond that, the High Luminosity LHC 
is planned to begin data-taking in 2022. However, just as the discovery of the 
W and Z bosons at the CERN $p\bar{p}$ collider was followed by precision studies 
at the $e^+e^-$ colliders LEP and SLC, a lepton collider is likely to be needed for 
the next stage of this fundamental exploration. 
Problems 


22.1 


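(a) Using the representation for $\boldsymbol{\alpha}$, $\beta$ and $\gamma_5$ introduced in section 20.2.2 
(equation (20.14)), massless particles are described by spinors of the 
form 

$$ u = E^{1/2} \begin{pmatrix} \phi_+ \\ \phi_- \end{pmatrix} \quad (\text{normalized to } u^\dagger u = 2E) $$

where $\boldsymbol{\sigma}\cdot\hat{\boldsymbol{p}}\,\phi_\pm = \pm\phi_\pm$, $\hat{\boldsymbol{p}} = \boldsymbol{p}/|\boldsymbol{p}|$. Find the explicit form of $u$ for the 
case $\hat{\boldsymbol{p}} = (\sin\theta, 0, \cos\theta)$. 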
(b) Consider the process $\bar{\nu}_\mu + \mu^- \to \bar{\nu}_e + e^-$, discussed in section 22.1, in the 
limit in which all masses are neglected. The amplitude is proportional 
to 

$$ G_{\rm F}\, \bar{v}(\bar{\nu}_\mu, {\rm R}) \gamma_\mu (1 - \gamma_5) u(\mu^-, {\rm L})\; \bar{u}(e^-, {\rm L}) \gamma^\mu (1 - \gamma_5) v(\bar{\nu}_e, {\rm R}) $$

where we have explicitly indicated the appropriate helicities R or L 
(note that, as explained in section 20.2.2, $(1 - \gamma_5)/2$ is the projection 
operator for a right-handed antineutrino). In the CM frame, 
let the initial $\mu^-$ momentum be $(0, 0, E)$ and the final $e^-$ momentum 
be $E(\sin\theta, 0, \cos\theta)$. Verify that the amplitude is proportional to 
$G_{\rm F} E^2 (1 + \cos\theta)$. (Hint: evaluate the ‘easy’ part $\bar{v}(\bar{\nu}_\mu)\gamma_\mu(1 - \gamma_5)u(\mu^-)$ 
first; this will show that the components $\mu = 0, z$ vanish, so that only 
the $\mu = x, y$ components of the dot product need to be calculated.) 


22.2 Verify equation (22.20). 


22.3 Check that when the polarization vector of each photon in figures 22.7(a) 
and (b) is replaced by the corresponding photon momentum, the sum of these 
two amplitudes vanishes. 


22.4 By identifying the part of (22.45) which has the form (22.57), derive 
(22.58). 


22.5 Using the vertex (22.48), verify (22.79). 
22.6 Insert (22.29) into (22.151) to derive (22.153). 


22.7 Verify that the neutral current part of (22.159) is diagonal in the ‘mass’ 
basis. 


22.8 Suppose that the Higgs field is a triplet of SU(2) rather than a doublet; 
and suppose that its vacuum value is 

$$ \langle 0|\hat{\phi}|0\rangle = \begin{pmatrix} 0 \\ 0 \\ f \end{pmatrix} $$

in the gauge in which it is real. The non-vanishing component has $t_3 = -1$, 
using 

$$ t_3 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & -1 \end{pmatrix} $$

in the ‘angular-momentum-like’ basis. Since we want the charge of the vacuum 
to be zero, and we have $Q = t_3 + y/2$, we must assign $y(\phi) = 2$. So the 
covariant derivative on $\phi$ is 

$$ (\partial_\mu + i g\, \boldsymbol{t}\cdot\boldsymbol{W}_\mu + i g' B_\mu)\phi, $$

where 

$$ t_1 = \begin{pmatrix} 0 & \frac{1}{\sqrt{2}} & 0 \\ \frac{1}{\sqrt{2}} & 0 & \frac{1}{\sqrt{2}} \\ 0 & \frac{1}{\sqrt{2}} & 0 \end{pmatrix}, \qquad t_2 = \begin{pmatrix} 0 & \frac{-i}{\sqrt{2}} & 0 \\ \frac{i}{\sqrt{2}} & 0 & \frac{-i}{\sqrt{2}} \\ 0 & \frac{i}{\sqrt{2}} & 0 \end{pmatrix} $$

and $t_3$ is as above (it is easy to check that these three matrices do satisfy the 
required SU(2) commutation relations $[t_1, t_2] = i t_3$). Show that the photon 
and Z fields are still given by (22.36) and (22.37), with the same $\sin\theta_W$ as in 
(22.39), but that now 

$$ M_Z = \sqrt{2}\, M_W / \cos\theta_W. $$

What is the value of the parameter $\rho$ in this model? 
22.9 Use (22.188) to verify (22.190). 
22.10 Calculate the lifetime of the top quark to decay via $t \to W^+ + b$. 


22.11 Using the Higgs couplings given in appendix Q, verify (22.209) and 
(22.210). 
Group Theory 


M.1 Definition and simple examples 


A group G is a set of elements (a,b, c,...) with a law for combining any two 
elements a, b so as to form their ordered ‘product’ ab, such that the following 
four conditions hold: 


(i) For every $a, b \in G$, the product $ab \in G$ (the symbol ‘$\in$’ means ‘belongs 
to’, or ‘is a member of’). 


(ii) The law of combination is associative, i.e. 


(ab)c = a(bc). (M.1) 


(iii) G contains a unique identity element, $e$, such that for all $a \in G$, 

$$ ae = ea = a. \tag{M.2} $$

(iv) For all $a \in G$, there is a unique inverse element, $a^{-1}$, such that 

$$ aa^{-1} = a^{-1}a = e. \tag{M.3} $$


Note that in general the law of combination is not commutative, i.e. 
$ab \neq ba$; if it is commutative, the group is Abelian; if not, it is non-Abelian. 
Any finite set of elements satisfying the conditions (i)—(iv) forms a finite group, 
the order of the group being equal to the number of elements in the set. If 
the set does not have a finite number of elements it is an infinite group. 

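As a simple example, the set of four numbers $(1, i, -1, -i)$ forms a finite 
Abelian group of order 4, with the law of combination being ordinary 
multiplication. The reader may check that each of (i)-(iv) is satisfied, with $e$ taken 
to be the number 1, and the inverse being the algebraic reciprocal. A second 
group of order 4 is provided by the matrices 

$$ \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \quad \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}, \quad \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \tag{M.4} $$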


with the combination law being matrix multiplication, ‘$e$’ being the first (unit) 
matrix, and the inverse being the usual matrix inverse. Although matrix 
multiplication is not commutative in general, it happens to be so for these 
particular matrices. In fact, the way these four matrices multiply together 
is (as the reader can verify) exactly the same as the way the four numbers 
$(1, i, -1, -i)$ (in that order) do. Further, the correspondence between the 
elements of the two groups is ‘one to one’: that is, if we label the two sets 
of group elements by $(e, a, b, c)$ and $(e', a', b', c')$, we have the correspondences 
$e \leftrightarrow e'$, $a \leftrightarrow a'$, $b \leftrightarrow b'$, $c \leftrightarrow c'$. Two groups with the same multiplication 
structure, and with a one-to-one correspondence between their elements, are 
said to be isomorphic. If they have the same multiplication structure but the 
correspondence is not one-to-one, they are homomorphic. 
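The isomorphism described above can be checked mechanically by comparing multiplication tables. In the sketch below the four $2 \times 2$ matrices are one concrete choice (the unit matrix, a matrix that squares to minus the unit matrix, and their products), taken as an assumption for the example; the helper names `matmul2` and `cayley_table` are likewise illustrative.

```python
# The four numbers and four 2x2 matrices, listed in corresponding order.
numbers = [1, 1j, -1, -1j]
matrices = [
    ((1, 0), (0, 1)),
    ((0, 1), (-1, 0)),
    ((-1, 0), (0, -1)),
    ((0, -1), (1, 0)),
]

def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def cayley_table(elements, multiply):
    """table[i][j] is the index of elements[i] * elements[j].

    list.index raises ValueError if a product is not in the set,
    so building the table also checks closure (condition (i)).
    """
    return [[elements.index(multiply(a, b)) for b in elements] for a in elements]

table_numbers = cayley_table(numbers, lambda a, b: a * b)
table_matrices = cayley_table(matrices, matmul2)

same_structure = table_numbers == table_matrices
is_abelian = all(
    table_matrices[i][j] == table_matrices[j][i]
    for i in range(4) for j in range(4)
)
print("same multiplication table:", same_structure)
print("Abelian:", is_abelian)
```

Because the two tables agree entry by entry under the correspondence $e \leftrightarrow e'$, $a \leftrightarrow a'$, and so on, the two groups have the same multiplication structure with a one-to-one correspondence of elements.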


M.2 Lie groups 


We are interested in continuous groups, that is, groups whose elements are 
labelled by a number of continuously variable real parameters $\alpha_1, \alpha_2, \ldots, \alpha_r$: 
$g(\alpha_1, \alpha_2, \ldots, \alpha_r) = g(\boldsymbol{\alpha})$. In particular, we are concerned with various kinds of 
‘coordinate transformations’ (not necessarily space-time ones, but including 
also ‘internal’ transformations such as those of SU(3)). For example, rotations 
in three dimensions form a group, whose elements are specified by three real 
parameters (e.g. two for defining the axis of the rotation, and one for the angle 
of rotation about that axis). Lorentz transformations also form a group, this 
time with six real parameters (three for 3-D rotations, three for pure velocity 
transformations). The matrices of SU(3) are specified by the values of eight 
real parameters. By convention, parametrizations are arranged in such a 
way that g(0) is the identity element of the group. For a continuous group, 
condition (i) takes the form 


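$$ g(\boldsymbol{\alpha})\, g(\boldsymbol{\beta}) = g(\boldsymbol{\gamma}(\boldsymbol{\alpha}, \boldsymbol{\beta})), \tag{M.5} $$

where the parameters $\boldsymbol{\gamma}$ are continuous functions of the parameters $\boldsymbol{\alpha}$ and $\boldsymbol{\beta}$. 
A more restrictive condition is that $\boldsymbol{\gamma}$ should be an analytic function of $\boldsymbol{\alpha}$ and 
$\boldsymbol{\beta}$; if this is the case, the group is a Lie group. 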

The analyticity condition implies that if we are given the form of the 
group elements in the neighbourhood of any one element, we can ‘move out’ 
from that neighbourhood to other nearby elements, using the mathematical 
procedure known as ‘analytic continuation’ (essentially, using a power series 
expansion); by repeating the process, we should be able to reach all group 
elements which are ‘continuously connected’ to the original element. The 
simplest group element to consider is the identity, which we shall now denote 
by $I$. Lie proved that the properties of the elements of a Lie group which can 
be reached continuously from the identity $I$ are determined from elements 
lying in the neighbourhood of $I$. 


M.3 Generators of Lie groups 

Consider (following Lichtenberg 1970, chapter 5) a group of transformations 
defined by 

$$ x'_i = f_i(x_1, x_2, \ldots, x_N; \alpha_1, \alpha_2, \ldots, \alpha_r), \tag{M.6} $$

where the $x_i$'s ($i = 1, 2, \ldots, N$) are the ‘coordinates’ on which the 
transformations act, and the $\alpha$'s are the (real) parameters of the transformations. By 
convention, $\boldsymbol{\alpha} = \boldsymbol{0}$ is the identity transformation, so 

$$ x_i = f_i(\boldsymbol{x}, \boldsymbol{0}). \tag{M.7} $$

A transformation in the neighbourhood of the identity is then given by 

$$ dx_i = \sum_{\nu=1}^{r} \frac{\partial f_i}{\partial \alpha_\nu}\, d\alpha_\nu, \tag{M.8} $$

where the $\{d\alpha_\nu\}$ are infinitesimal parameters, and the partial derivative is 
understood to be evaluated at the point $(\boldsymbol{x}, \boldsymbol{0})$. 

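Consider now the change in a function $F(\boldsymbol{x})$ under the infinitesimal 
transformation (M.8). We have 

$$ F \to F + dF = F + \sum_{i=1}^{N} \frac{\partial F}{\partial x_i}\, dx_i = F + \sum_{i=1}^{N} \sum_{\nu=1}^{r} \frac{\partial f_i}{\partial \alpha_\nu}\, d\alpha_\nu \frac{\partial F}{\partial x_i} \equiv \left\{ 1 - \sum_{\nu=1}^{r} d\alpha_\nu\, i \hat{X}_\nu \right\} F \tag{M.9} $$

where 

$$ \hat{X}_\nu \equiv i \sum_{i=1}^{N} \frac{\partial f_i}{\partial \alpha_\nu} \frac{\partial}{\partial x_i} \tag{M.10} $$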


is a generator of infinitesimal transformations$^1$. Note that in (M.10) $\nu$ runs 
from 1 to $r$, so there are as many generators as there are parameters labelling 
the group elements. Finite transformations are obtained by ‘exponentiating’ 
the quantity in braces in (M.9) (compare (12.30)): 

$$ \hat{U}(\boldsymbol{\alpha}) = \exp\{-i \boldsymbol{\alpha}\cdot\hat{\boldsymbol{X}}\}, \tag{M.11} $$

where we have written $\sum_{\nu=1}^{r} \alpha_\nu \hat{X}_\nu = \boldsymbol{\alpha}\cdot\hat{\boldsymbol{X}}$. 

$^1$ Clearly there is a lot of ‘convention’ (the sign, the $i$) in the definition of $\hat{X}_\nu$. It is chosen 
for convenient consistency with familiar generators, for example those of SO(3) (see section 
M.4.1). 

An important theorem states that the commutator of any two generators 
of a Lie group is a linear combination of the generators: 


$$ [\hat{X}_\lambda, \hat{X}_\mu] = c^{\nu}_{\lambda\mu} \hat{X}_\nu, \tag{M.12} $$

where the constants $c^{\nu}_{\lambda\mu}$ are complex numbers called the structure constants 
of the group; a sum over $\nu$ from 1 to $r$ is understood on the right-hand side. 
The commutation relations (M.12) are called the algebra of the group. 


M.4 Examples 

M.4.1 SO(3) and three-dimensional rotations 

Rotations in three dimensions are defined by 

$$ \boldsymbol{x}' = R\boldsymbol{x}, \tag{M.13} $$

where $R$ is a real $3 \times 3$ matrix such that the length of $\boldsymbol{x}$ is preserved, i.e. 
$\boldsymbol{x}'^{\rm T}\boldsymbol{x}' = \boldsymbol{x}^{\rm T}\boldsymbol{x}$. This implies that $R^{\rm T}R = I$, so that $R$ is an orthogonal matrix. 
It follows that 

$$ 1 = \det(R^{\rm T}R) = \det R^{\rm T} \det R = (\det R)^2, \tag{M.14} $$

and so $\det R = \pm 1$. Those $R$'s with $\det R = -1$ include a parity transformation 
($\boldsymbol{x}' = -\boldsymbol{x}$), which is not continuously connected to the identity. Those with 
$\det R = 1$ are ‘proper rotations’, and they form the elements of the group 
SO(3): the Special Orthogonal group in 3 dimensions. 

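An $R$ close to the identity matrix $I$ can be written as $R = I + \delta R$ where 

$$ (I + \delta R)^{\rm T}(I + \delta R) = I. \tag{M.15} $$

Expanding this out to first order in $\delta R$ gives 

$$ \delta R^{\rm T} = -\delta R, \tag{M.16} $$

so that $\delta R$ is an antisymmetric $3 \times 3$ matrix (compare (12.19)). We may 
parametrize $\delta R$ as 

$$ \delta R = \begin{pmatrix} 0 & \epsilon_3 & -\epsilon_2 \\ -\epsilon_3 & 0 & \epsilon_1 \\ \epsilon_2 & -\epsilon_1 & 0 \end{pmatrix}, \tag{M.17} $$

and an infinitesimal rotation is then given by 

$$ \boldsymbol{x}' = \boldsymbol{x} - \boldsymbol{\epsilon} \times \boldsymbol{x}, \tag{M.18} $$

(compare (12.64)), or 

$$ dx_1 = -\epsilon_2 x_3 + \epsilon_3 x_2, \quad dx_2 = -\epsilon_3 x_1 + \epsilon_1 x_3, \quad dx_3 = -\epsilon_1 x_2 + \epsilon_2 x_1. \tag{M.19} $$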


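Thus in (M.8), identifying $d\alpha_1 \equiv \epsilon_1$, $d\alpha_2 \equiv \epsilon_2$, $d\alpha_3 \equiv \epsilon_3$, we have 

$$ \frac{\partial f_1}{\partial \alpha_1} = 0, \quad \frac{\partial f_1}{\partial \alpha_2} = -x_3, \quad \frac{\partial f_1}{\partial \alpha_3} = x_2, \quad \text{etc.} \tag{M.20} $$

The generators (M.10) are then 

$$ \hat{X}_1 = i x_3 \frac{\partial}{\partial x_2} - i x_2 \frac{\partial}{\partial x_3}, \qquad \hat{X}_2 = i x_1 \frac{\partial}{\partial x_3} - i x_3 \frac{\partial}{\partial x_1}, \qquad \hat{X}_3 = i x_2 \frac{\partial}{\partial x_1} - i x_1 \frac{\partial}{\partial x_2} \tag{M.21} $$

which are easily recognized as the quantum-mechanical angular momentum 
operators 

$$ \hat{\boldsymbol{X}} = \boldsymbol{x} \times (-i\boldsymbol{\nabla}), \tag{M.22} $$

which satisfy the SO(3) algebra 

$$ [\hat{X}_i, \hat{X}_j] = i\epsilon_{ijk} \hat{X}_k. \tag{M.23} $$

The action of finite rotations, parametrized by $\boldsymbol{\alpha} = (\alpha_1, \alpha_2, \alpha_3)$, on functions 
$F$ is given by 

$$ \hat{U}(\boldsymbol{\alpha}) = \exp\{-i\boldsymbol{\alpha}\cdot\hat{\boldsymbol{X}}\}. \tag{M.24} $$

The operators $\hat{U}(\boldsymbol{\alpha})$ form a group which is isomorphic to SO(3). The structure 
constants of SO(3) are $i\epsilon_{ijk}$, from (M.23). 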
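The SO(3) algebra (M.23) can be verified symbolically by letting the differential operators of (M.21) act on an arbitrary function. The following is a minimal sketch using SymPy; the names `X1`, `X2`, `X3`, `commutator` and `so3_residual` are illustrative, and indices run from 0 to 2 in the code rather than 1 to 3.

```python
import sympy as sp

x1, x2, x3 = sp.symbols("x1 x2 x3", real=True)
F = sp.Function("F")(x1, x2, x3)  # arbitrary test function

# The three generators of (M.21), written as maps f -> X f.
def X1(f):
    return sp.I * (x3 * sp.diff(f, x2) - x2 * sp.diff(f, x3))

def X2(f):
    return sp.I * (x1 * sp.diff(f, x3) - x3 * sp.diff(f, x1))

def X3(f):
    return sp.I * (x2 * sp.diff(f, x1) - x1 * sp.diff(f, x2))

generators = [X1, X2, X3]

def commutator(A, B, f):
    """[A, B] acting on f."""
    return A(B(f)) - B(A(f))

def so3_residual(i, j):
    """[X_i, X_j] F - i eps_ijk X_k F, which should vanish identically."""
    rhs = sum(sp.LeviCivita(i, j, k) * generators[k](F) for k in range(3))
    return sp.simplify(commutator(generators[i], generators[j], F) - sp.I * rhs)

residuals = [so3_residual(i, j) for i in range(3) for j in range(3)]
print(all(r == 0 for r in residuals))
```

The second-derivative terms cancel in each commutator, leaving only first-order operators, which is why the commutator of two generators is again a linear combination of generators.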


M.4.2 SU(2) 


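We write the infinitesimal SU(2) transformation (acting on a general complex 
two-component column vector) as (cf (12.27)) 

$$ \begin{pmatrix} q'_1 \\ q'_2 \end{pmatrix} = (1 + i\boldsymbol{\epsilon}\cdot\boldsymbol{\tau}/2) \begin{pmatrix} q_1 \\ q_2 \end{pmatrix} \tag{M.25} $$

so that 

$$ dq_1 = \frac{i\epsilon_3}{2} q_1 + \left( \frac{i\epsilon_1}{2} + \frac{\epsilon_2}{2} \right) q_2, \qquad dq_2 = \frac{-i\epsilon_3}{2} q_2 + \left( \frac{i\epsilon_1}{2} - \frac{\epsilon_2}{2} \right) q_1. \tag{M.26} $$

Then (with $d\alpha_1 \equiv \epsilon_1$ etc.) 

$$ \frac{\partial f_1}{\partial \alpha_1} = \frac{i q_2}{2}, \quad \frac{\partial f_1}{\partial \alpha_2} = \frac{q_2}{2}, \quad \frac{\partial f_1}{\partial \alpha_3} = \frac{i q_1}{2}, \tag{M.27} $$

$$ \frac{\partial f_2}{\partial \alpha_1} = \frac{i q_1}{2}, \quad \frac{\partial f_2}{\partial \alpha_2} = -\frac{q_1}{2}, \quad \frac{\partial f_2}{\partial \alpha_3} = -\frac{i q_2}{2}, \tag{M.28} $$

and (from (M.10)) 

$$ \hat{X}'_1 = -\frac{1}{2} \left\{ q_2 \frac{\partial}{\partial q_1} + q_1 \frac{\partial}{\partial q_2} \right\} \tag{M.29} $$

$$ \hat{X}'_2 = \frac{i}{2} \left\{ q_2 \frac{\partial}{\partial q_1} - q_1 \frac{\partial}{\partial q_2} \right\} \tag{M.30} $$

$$ \hat{X}'_3 = \frac{1}{2} \left\{ -q_1 \frac{\partial}{\partial q_1} + q_2 \frac{\partial}{\partial q_2} \right\}. \tag{M.31} $$

It is an interesting exercise to check that the commutation relations of the 
$\hat{X}'_i$'s are exactly the same as those of the $\hat{X}_i$'s in (M.23). The two groups are 
therefore said to have the same algebra, with the same structure constants, 
and they are in fact isomorphic in the vicinity of their respective identity 
elements. They are not the same for ‘large’ transformations, however, as we 
discuss in section M.7. 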
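The exercise suggested above can be carried out symbolically. The sketch below builds the three SU(2) differential operators directly from the definition (M.10), using the infinitesimal transformation $dq = (i\boldsymbol{\epsilon}\cdot\boldsymbol{\tau}/2)\,q$, and checks that they obey the same commutation relations as (M.23). Treating $q_1$ and $q_2$ as independent symbols, and the helper names `generator` and `residual`, are assumptions of this example.

```python
import sympy as sp

q1, q2 = sp.symbols("q1 q2")
q = sp.Matrix([q1, q2])
F = sp.Function("F")(q1, q2)  # arbitrary test function

# Pauli matrices
tau = [
    sp.Matrix([[0, 1], [1, 0]]),
    sp.Matrix([[0, -sp.I], [sp.I, 0]]),
    sp.Matrix([[1, 0], [0, -1]]),
]

def generator(nu):
    """X'_nu = i sum_i (df_i/dalpha_nu) d/dq_i, with df/dalpha_nu = (i tau_nu / 2) q."""
    df = sp.I * tau[nu] * q / 2
    return lambda f: sp.I * (df[0] * sp.diff(f, q1) + df[1] * sp.diff(f, q2))

Xp = [generator(nu) for nu in range(3)]

def residual(i, j):
    """[X'_i, X'_j] F - i eps_ijk X'_k F, which should vanish identically."""
    comm = Xp[i](Xp[j](F)) - Xp[j](Xp[i](F))
    rhs = sum(sp.LeviCivita(i, j, k) * Xp[k](F) for k in range(3))
    return sp.simplify(comm - sp.I * rhs)

residuals = [residual(i, j) for i in range(3) for j in range(3)]
print(all(r == 0 for r in residuals))

# Explicit form of the first generator, for comparison with (M.29).
print(sp.expand(Xp[0](F)))
```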


M.4.3 SO(4): The special orthogonal group in four 
dimensions 


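This is the group whose elements are $4 \times 4$ matrices $S$ such that $S^{\rm T}S = I$, 
where $I$ is the $4 \times 4$ unit matrix, with the condition $\det S = +1$. The Euclidean 
(length)$^2$ $x_1^2 + x_2^2 + x_3^2 + x_4^2$ is left invariant under SO(4) transformations. 
Infinitesimal SO(4) transformations are characterized by the 4-D analogue 
of those for SO(3), namely by $4 \times 4$ real antisymmetric matrices $\delta S$, which 
have 6 real parameters. We choose to parametrize $\delta S$ in such a way that the 
Euclidean 4-vector $(\boldsymbol{x}, x_4)$ is transformed to (cf (18.76) and (18.77)) 

$$ \boldsymbol{x}' = \boldsymbol{x} - \boldsymbol{\epsilon} \times \boldsymbol{x} - \boldsymbol{\eta}\, x_4, \qquad x'_4 = x_4 + \boldsymbol{\eta}\cdot\boldsymbol{x}, \tag{M.32} $$

where $\boldsymbol{x} = (x_1, x_2, x_3)$ and $\boldsymbol{\eta} = (\eta_1, \eta_2, \eta_3)$. Note that the first three 
components transform by (M.18) when $\boldsymbol{\eta} = \boldsymbol{0}$, so that SO(3) is a subgroup of SO(4). 
The six generators are (with $d\alpha_1 \equiv \epsilon_1$ etc.) 

$$ \hat{X}_1 = i x_3 \frac{\partial}{\partial x_2} - i x_2 \frac{\partial}{\partial x_3} \tag{M.33} $$

and similarly for $\hat{X}_2$ and $\hat{X}_3$ as in (M.21), together with (defining $d\alpha_4 \equiv \eta_1$ 
etc.) 

$$ \hat{X}_4 = i \left( x_1 \frac{\partial}{\partial x_4} - x_4 \frac{\partial}{\partial x_1} \right) \tag{M.34} $$

$$ \hat{X}_5 = i \left( x_2 \frac{\partial}{\partial x_4} - x_4 \frac{\partial}{\partial x_2} \right) \tag{M.35} $$

$$ \hat{X}_6 = i \left( x_3 \frac{\partial}{\partial x_4} - x_4 \frac{\partial}{\partial x_3} \right). \tag{M.36} $$

Relabelling these last three generators as $\hat{Y}_1 \equiv \hat{X}_4$, $\hat{Y}_2 \equiv \hat{X}_5$, $\hat{Y}_3 \equiv \hat{X}_6$, we 
find the following algebra: 

$$ [\hat{X}_i, \hat{X}_j] = i\epsilon_{ijk} \hat{X}_k \tag{M.37} $$

$$ [\hat{X}_i, \hat{Y}_j] = i\epsilon_{ijk} \hat{Y}_k \tag{M.38} $$

$$ [\hat{Y}_i, \hat{Y}_j] = i\epsilon_{ijk} \hat{X}_k, \tag{M.39} $$

together with 

$$ [\hat{X}_1, \hat{Y}_1] = [\hat{X}_2, \hat{Y}_2] = [\hat{X}_3, \hat{Y}_3] = 0. \tag{M.40} $$

(M.37) confirms that the three generators controlling infinitesimal 
transformations among the first three components $\boldsymbol{x}$ obey the angular momentum 
commutation relations. (M.37)-(M.40) constitute the algebra of SO(4). 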

This algebra may be simplified by introducing the linear combinations 

$$ \hat{M}_i = \frac{1}{2}(\hat{X}_i + \hat{Y}_i) \tag{M.41} $$

$$ \hat{N}_i = \frac{1}{2}(\hat{X}_i - \hat{Y}_i) \tag{M.42} $$

which satisfy 

$$ [\hat{M}_i, \hat{M}_j] = i\epsilon_{ijk} \hat{M}_k \tag{M.43} $$

$$ [\hat{N}_i, \hat{N}_j] = i\epsilon_{ijk} \hat{N}_k \tag{M.44} $$

$$ [\hat{M}_i, \hat{N}_j] = 0. \tag{M.45} $$

From (M.43)-(M.45) we see that, in this form, the six generators have 
separated into two sets of three, each set obeying the algebra of SO(3) (or of 
SU(2)), and commuting with the other set. They therefore behave like two 
independent angular momentum operators. The algebra (M.43)-(M.45) is 
referred to as SU(2) $\times$ SU(2). 
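The decoupling into two commuting SU(2) algebras can be checked symbolically. The sketch below assumes the six SO(4) generators in the form $\hat{X}_1 = i(x_3\partial_2 - x_2\partial_3)$ (and cyclic) and $\hat{Y}_i = i(x_i\partial_4 - x_4\partial_i)$, which follow from (M.10) applied to (M.32); the helper names `rot`, `combine`, `comm` and `closes` are illustrative, and code indices are 0-based.

```python
import sympy as sp

x = sp.symbols("x1:5", real=True)   # (x1, x2, x3, x4)
F = sp.Function("F")(*x)            # arbitrary test function

def rot(a, b):
    """Operator f -> i (x_a d/dx_b - x_b d/dx_a) f, with 0-based indices."""
    return lambda f: sp.I * (x[a] * sp.diff(f, x[b]) - x[b] * sp.diff(f, x[a]))

X = [rot(2, 1), rot(0, 2), rot(1, 0)]   # rotations among x1, x2, x3
Y = [rot(0, 3), rot(1, 3), rot(2, 3)]   # generators mixing x_i with x4

def combine(A, B, sign):
    """The operator (A + sign * B) / 2."""
    return lambda f: sp.Rational(1, 2) * (A(f) + sign * B(f))

M = [combine(X[i], Y[i], +1) for i in range(3)]
N = [combine(X[i], Y[i], -1) for i in range(3)]

def comm(A, B):
    """[A, B] acting on the test function F."""
    return A(B(F)) - B(A(F))

def closes(ops, i, j):
    """True if [ops_i, ops_j] = i eps_ijk ops_k when acting on F."""
    rhs = sp.I * sum(sp.LeviCivita(i, j, k) * ops[k](F) for k in range(3))
    return sp.simplify(comm(ops[i], ops[j]) - rhs) == 0

pairs = [(i, j) for i in range(3) for j in range(3)]
M_algebra_ok = all(closes(M, i, j) for i, j in pairs)
N_algebra_ok = all(closes(N, i, j) for i, j in pairs)
M_N_commute = all(sp.simplify(comm(M[i], N[j])) == 0 for i, j in pairs)
print(M_algebra_ok, N_algebra_ok, M_N_commute)
```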


M.4.4 The Lorentz group 


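In this case the quadratic form left invariant by the transformation is the 
Minkowskian one $(x^0)^2 - \boldsymbol{x}^2$ (see appendix D of volume 1). We may think of 
infinitesimal Lorentz transformations as corresponding physically to ordinary 
infinitesimal 3-D rotations, together with infinitesimal pure velocity 
transformations (‘boosts’). The basic 4-vector then transforms by 

$$ x^{0\prime} = x^0 - \boldsymbol{\eta}\cdot\boldsymbol{x}, \qquad \boldsymbol{x}' = \boldsymbol{x} - \boldsymbol{\epsilon} \times \boldsymbol{x} - \boldsymbol{\eta}\, x^0 \tag{M.46} $$

where $\boldsymbol{\eta}$ is now the infinitesimal velocity parameter (the reader may check 
that $(x^0)^2 - \boldsymbol{x}^2$ is indeed left invariant by (M.46), to first order in $\boldsymbol{\epsilon}$ and $\boldsymbol{\eta}$). 
The six generators are then $\hat{X}_1, \hat{X}_2, \hat{X}_3$ as in (M.21), together with 

$$ \hat{K}_1 = -i \left( x^1 \frac{\partial}{\partial x^0} + x^0 \frac{\partial}{\partial x^1} \right) \tag{M.47} $$

$$ \hat{K}_2 = -i \left( x^2 \frac{\partial}{\partial x^0} + x^0 \frac{\partial}{\partial x^2} \right) \tag{M.48} $$

$$ \hat{K}_3 = -i \left( x^3 \frac{\partial}{\partial x^0} + x^0 \frac{\partial}{\partial x^3} \right). \tag{M.49} $$

The corresponding algebra is 

$$ [\hat{X}_i, \hat{X}_j] = i\epsilon_{ijk} \hat{X}_k \tag{M.50} $$

$$ [\hat{X}_i, \hat{K}_j] = i\epsilon_{ijk} \hat{K}_k \tag{M.51} $$

$$ [\hat{K}_i, \hat{K}_j] = -i\epsilon_{ijk} \hat{X}_k. \tag{M.52} $$

Note the minus sign on the right-hand side of (M.52) as compared with (M.39). 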
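The sign difference between the boost algebra and the SO(4) case can be confirmed with the same symbolic technique. The sketch below assumes boost generators of the form $\hat{K}_i = -i(x^i \partial_0 + x^0 \partial_i)$ and the rotation generators of (M.21); the helper names `X`, `K`, `comm` and `eps_sum` are illustrative, and code indices are 0-based.

```python
import sympy as sp

x0, x1, x2, x3 = sp.symbols("x0 x1 x2 x3", real=True)
space = [x1, x2, x3]
F = sp.Function("F")(x0, x1, x2, x3)  # arbitrary test function

def X(i):
    """Rotation generator: X_1 = i (x3 d/dx2 - x2 d/dx3) and cyclic permutations."""
    j, k = (i + 1) % 3, (i + 2) % 3
    return lambda f: sp.I * (space[k] * sp.diff(f, space[j])
                             - space[j] * sp.diff(f, space[k]))

def K(i):
    """Boost generator: K_i = -i (x^i d/dx^0 + x^0 d/dx^i)."""
    return lambda f: -sp.I * (space[i] * sp.diff(f, x0) + x0 * sp.diff(f, space[i]))

def comm(A, B):
    """[A, B] acting on the test function F."""
    return A(B(F)) - B(A(F))

def eps_sum(i, j, ops):
    """sum_k eps_ijk ops(k) F."""
    return sum(sp.LeviCivita(i, j, k) * ops(k)(F) for k in range(3))

pairs = [(i, j) for i in range(3) for j in range(3)]

# [K_i, K_j] = -i eps_ijk X_k : note the minus sign
boosts_close_on_rotations = all(
    sp.simplify(comm(K(i), K(j)) + sp.I * eps_sum(i, j, X)) == 0 for i, j in pairs
)
# [X_i, K_j] = +i eps_ijk K_k : boosts transform as a vector under rotations
boosts_are_vectors = all(
    sp.simplify(comm(X(i), K(j)) - sp.I * eps_sum(i, j, K)) == 0 for i, j in pairs
)
print(boosts_close_on_rotations, boosts_are_vectors)
```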


M.4.5 SU(3) 


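A general infinitesimal SU(3) transformation may be written as (cf (12.71) 
and (12.72)) 

$$ \begin{pmatrix} q'_1 \\ q'_2 \\ q'_3 \end{pmatrix} = \left( 1 + \frac{i}{2}\boldsymbol{\eta}\cdot\boldsymbol{\lambda} \right) \begin{pmatrix} q_1 \\ q_2 \\ q_3 \end{pmatrix}, \tag{M.53} $$

where there are now 8 of these $\eta$'s, $\boldsymbol{\eta} = (\eta_1, \eta_2, \ldots, \eta_8)$, and the $\lambda$-matrices 
are the Gell-Mann matrices 

$$ \lambda_1 = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad \lambda_2 = \begin{pmatrix} 0 & -i & 0 \\ i & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad \lambda_3 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 0 \end{pmatrix} \tag{M.54} $$

$$ \lambda_4 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix}, \quad \lambda_5 = \begin{pmatrix} 0 & 0 & -i \\ 0 & 0 & 0 \\ i & 0 & 0 \end{pmatrix}, \quad \lambda_6 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix} \tag{M.55} $$

$$ \lambda_7 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -i \\ 0 & i & 0 \end{pmatrix}, \quad \lambda_8 = \begin{pmatrix} \frac{1}{\sqrt{3}} & 0 & 0 \\ 0 & \frac{1}{\sqrt{3}} & 0 \\ 0 & 0 & -\frac{2}{\sqrt{3}} \end{pmatrix}. \tag{M.56} $$

In this parametrization the first three of the eight generators $\hat{G}_r$ ($r = 1, 2, \ldots, 8$) 
are the same as $\hat{X}'_1, \hat{X}'_2, \hat{X}'_3$ of (M.29)-(M.31). The others may be constructed 
as usual from (M.10); for example, 

$$ \hat{G}_5 = \frac{i}{2} \left( q_3 \frac{\partial}{\partial q_1} - q_1 \frac{\partial}{\partial q_3} \right), \qquad \hat{G}_7 = \frac{i}{2} \left( q_3 \frac{\partial}{\partial q_2} - q_2 \frac{\partial}{\partial q_3} \right). \tag{M.57} $$

The SU(3) algebra is found to be 

$$ [\hat{G}_a, \hat{G}_b] = i f_{abc} \hat{G}_c, \tag{M.58} $$

where $a$, $b$ and $c$ each run from 1 to 8. The structure constants are $i f_{abc}$, and 
the non-vanishing $f$'s are as follows: 

$$ f_{123} = 1, \quad f_{147} = 1/2, \quad f_{156} = -1/2, \quad f_{246} = 1/2, \quad f_{257} = 1/2 \tag{M.59} $$

$$ f_{345} = 1/2, \quad f_{367} = -1/2, \quad f_{458} = \sqrt{3}/2, \quad f_{678} = \sqrt{3}/2. \tag{M.60} $$

Note that the $f$'s are antisymmetric in all pairs of indices (Carruthers (1966) 
chapter 2). 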
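The tabulated values of the $f$'s can be reproduced numerically from the Gell-Mann matrices. The sketch below uses the matrix form of the algebra, $[\lambda_a, \lambda_b] = 2i f_{abc}\lambda_c$, together with the standard normalization ${\rm Tr}(\lambda_a\lambda_b) = 2\delta_{ab}$, so that $f_{abc} = {\rm Tr}([\lambda_a, \lambda_b]\lambda_c)/4i$; the function names are illustrative, and indices in the code are 0-based, so $f_{147}$ is `f[0, 3, 6]`.

```python
import numpy as np

def gell_mann():
    """Return the eight Gell-Mann matrices as an array of shape (8, 3, 3)."""
    lam = np.zeros((8, 3, 3), dtype=complex)
    lam[0] = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
    lam[1] = [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]]
    lam[2] = [[1, 0, 0], [0, -1, 0], [0, 0, 0]]
    lam[3] = [[0, 0, 1], [0, 0, 0], [1, 0, 0]]
    lam[4] = [[0, 0, -1j], [0, 0, 0], [1j, 0, 0]]
    lam[5] = [[0, 0, 0], [0, 0, 1], [0, 1, 0]]
    lam[6] = [[0, 0, 0], [0, 0, -1j], [0, 1j, 0]]
    lam[7] = np.diag([1, 1, -2]) / np.sqrt(3)
    return lam

def structure_constants(lam):
    """f_abc = Tr([lam_a, lam_b] lam_c) / (4i), returned as a real (8, 8, 8) array."""
    comm = (np.einsum("aij,bjk->abik", lam, lam)
            - np.einsum("bij,ajk->abik", lam, lam))
    return (np.einsum("abik,cki->abc", comm, lam) / 4j).real

lam = gell_mann()
f = structure_constants(lam)

# Non-vanishing values listed in (M.59) and (M.60), with 1-based labels.
listed = {
    (1, 2, 3): 1.0, (1, 4, 7): 0.5, (1, 5, 6): -0.5, (2, 4, 6): 0.5,
    (2, 5, 7): 0.5, (3, 4, 5): 0.5, (3, 6, 7): -0.5,
    (4, 5, 8): np.sqrt(3) / 2, (6, 7, 8): np.sqrt(3) / 2,
}
for (a, b, c), value in listed.items():
    print(f"f_{a}{b}{c} = {f[a - 1, b - 1, c - 1]:+.4f}  (listed: {value:+.4f})")
```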


M.5 Matrix representations of generators, and of Lie 
groups 


We have shown how the generators $\hat{X}_1, \hat{X}_2, \ldots, \hat{X}_r$ of a Lie group can be 
constructed as differential operators, understood to be acting on functions of the 
‘coordinates’ to which the transformations of the group refer. These 
generators satisfy certain commutation relations, the Lie algebra of the group. For 
any given Lie algebra, it is also possible to find sets of matrices $X_1, X_2, \ldots, X_r$ 
(without hats) which satisfy the same commutation relations as the $\hat{X}_\nu$'s; 
that is, they have the same algebra. Such matrices are said to form a (matrix) 
representation of the Lie algebra, or equivalently of the generators. The 
idea is familiar from the study of angular momentum in quantum mechanics 
(Schiff 1968, section 27), where the entire theory may be developed from the 
commutation relations (with $\hbar = 1$) 

$$ [\hat{J}_i, \hat{J}_j] = i\epsilon_{ijk} \hat{J}_k \tag{M.61} $$

for the angular momentum operators $\hat{J}_i$, together with the physical 
requirement that the $\hat{J}_i$'s (and the matrices representing them) must be Hermitian. 
In this case the matrices are of the form (in quantum-mechanical notation) 

$$ \left( J_i^{(J)} \right)_{M'_J M_J} = \langle J M'_J | \hat{J}_i | J M_J \rangle, \tag{M.62} $$

where $|J M_J\rangle$ is an eigenstate of $\hat{\boldsymbol{J}}^2$ and of $\hat{J}_3$ with eigenvalues $J(J + 1)$ and 
$M_J$ respectively. Since $M_J$ and $M'_J$ each run over the $2J + 1$ values defined by 
$-J \leq M_J, M'_J \leq J$, the matrices $J_i^{(J)}$ are of dimension $(2J + 1) \times (2J + 1)$. 
Clearly, since the generators of SU(2) have the same algebra as (M.61), an 
identical matrix representation may be obtained for them; these matrices were 
denoted by $T_i^{(T)}$ in section 12.1.2. It is important to note that $J$ (or $T$) 
can take an infinite sequence of values $J = 0, 1/2, 1, 3/2, \ldots$, corresponding 
physically to various ‘spin’ magnitudes. Thus there are infinitely many sets 
of three matrices $(J_1^{(J)}, J_2^{(J)}, J_3^{(J)})$, all with the same commutation relations as 
(M.61). 

A similar method for obtaining matrix representations of Lie algebras may 
be followed in other cases. In physical terms, the problem amounts to finding 
a correct labelling of the base states, analogous to $|J M_J\rangle$. In the latter case, 
the quantum number $J$ specifies each different representation. It does so 
because (as should be familiar) the corresponding operator $\hat{\boldsymbol{J}}^2$ 
commutes with every generator: 

$$ [\hat{\boldsymbol{J}}^2, \hat{J}_i] = 0. \tag{M.63} $$

Such an operator is called a Casimir operator, and by a lemma due to Schur 
(Hammermesh 1962, pages 100-101) it must be a multiple of the unit operator. 
The numerical value it has is different for each different representation, and 
may therefore be used to characterize a representation (namely as ‘$J = 0$’, 
‘$J = 1/2$’, etc.). 

In general, more than one such operator is needed to characterize a 
representation completely. For example, in SO(4), the two operators $\hat{\boldsymbol{M}}^2$ and $\hat{\boldsymbol{N}}^2$ 
commute with all the generators, and take values $M(M + 1)$ and $N(N + 1)$ 
respectively, where $M, N = 0, 1/2, 1, \ldots$. Thus the labelling of the matrix 
elements of the generators is the same as it would be for two independent 
particles, one of spin $M$ and the other of spin $N$. For given $M, N$ the matrices 
are of dimension $[(2M + 1)(2N + 1)] \times [(2M + 1)(2N + 1)]$. The number of 
Casimir operators required to characterize a representation is called the rank 
of the group (or the algebra). This is also equal to the number of independent 
mutually commuting generators (though this is by no means obvious). Thus 
SO(4) is a rank two group, with two commuting generators $\hat{M}_3$ and $\hat{N}_3$; so 
is SU(3), since $\hat{G}_3$ and $\hat{G}_8$ commute. Two Casimir operators are therefore 
required to characterize the representations of SU(3), which may be taken to 
be the ‘quadratic’ one 

$$ \hat{C}_2 \equiv \hat{G}_1^2 + \hat{G}_2^2 + \cdots + \hat{G}_8^2, \tag{M.64} $$

together with a ‘cubic’ one 

$$ \hat{C}_3 \equiv d_{abc}\, \hat{G}_a \hat{G}_b \hat{G}_c, \tag{M.65} $$

where the coefficients $d_{abc}$ are defined by the relation 

$$ \{\lambda_a, \lambda_b\} = \frac{4}{3}\delta_{ab} I + 2 d_{abc} \lambda_c, \tag{M.66} $$

and are symmetric in all pairs of indices (they are tabulated in Carruthers 
1966, table 2.1). In practice, for the few SU(3) representations that are 
actually required, it is more common to denote them (as we have in the text) 
by their dimensionality, which for the cases 1 (singlet), 3 (triplet), 3* 
(antitriplet), 8 (octet) and 10 (decuplet) is in fact a unique labelling. The values 
of $\hat{C}_2$ in these representations are 

$$ \hat{C}_2(1) = 0, \quad \hat{C}_2(3, 3^*) = 4/3, \quad \hat{C}_2(8) = 3, \quad \hat{C}_2(10) = 6. \tag{M.67} $$

Having characterized a given representation by the eigenvalues of the 
Casimir operator(s), a further labelling is then required to characterize the 
states within a given representation (the analogue of the eigenvalue of $\hat{J}_3$ for 
angular momentum). For SO(4) these further labels may be taken to be the 
eigenvalues of $\hat{M}_3$ and $\hat{N}_3$; for SU(3) they are the eigenvalues of $\hat{G}_3$ and $\hat{G}_8$, 
i.e. those corresponding to the third component of isospin and hypercharge, 
in the flavour case (see figures 12.3 and 12.4). 
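Two of the Casimir values in (M.67), and the anticommutator relation (M.66), can be checked numerically. The sketch below represents the generators by $\lambda_a/2$ in the triplet and by $(T_a)_{bc} = -i f_{abc}$ in the octet, and sums their squares; the $d_{abc}$ are extracted as ${\rm Tr}(\{\lambda_a, \lambda_b\}\lambda_c)/4$, which follows from ${\rm Tr}(\lambda_a\lambda_b) = 2\delta_{ab}$. All function and variable names are illustrative assumptions.

```python
import numpy as np

def gell_mann():
    """Return the eight Gell-Mann matrices as an array of shape (8, 3, 3)."""
    lam = np.zeros((8, 3, 3), dtype=complex)
    lam[0] = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
    lam[1] = [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]]
    lam[2] = [[1, 0, 0], [0, -1, 0], [0, 0, 0]]
    lam[3] = [[0, 0, 1], [0, 0, 0], [1, 0, 0]]
    lam[4] = [[0, 0, -1j], [0, 0, 0], [1j, 0, 0]]
    lam[5] = [[0, 0, 0], [0, 0, 1], [0, 1, 0]]
    lam[6] = [[0, 0, 0], [0, 0, -1j], [0, 1j, 0]]
    lam[7] = np.diag([1, 1, -2]) / np.sqrt(3)
    return lam

lam = gell_mann()
products = np.einsum("aij,bjk->abik", lam, lam)   # lam_a lam_b
comm = products - products.transpose(1, 0, 2, 3)  # [lam_a, lam_b]
anti = products + products.transpose(1, 0, 2, 3)  # {lam_a, lam_b}

f = (np.einsum("abik,cki->abc", comm, lam) / 4j).real   # f_abc
d = (np.einsum("abik,cki->abc", anti, lam) / 4).real    # d_abc

def quadratic_casimir(generators):
    """Sum of the squares of a list (or stacked array) of generator matrices."""
    return sum(g @ g for g in generators)

T_triplet = lam / 2.0   # fundamental representation
T_octet = -1j * f       # (T_a)_{bc} = -i f_abc

C2_triplet = quadratic_casimir(T_triplet)
C2_octet = quadratic_casimir(T_octet)
print(np.round(C2_triplet.real, 6))      # proportional to the 3x3 unit matrix
print(np.round(C2_octet.real[0, 0], 6))  # common diagonal value in the octet

# Right-hand side of (M.66): (4/3) delta_ab I + 2 d_abc lam_c
rhs = (4.0 / 3.0) * np.einsum("ab,ik->abik", np.eye(8), np.eye(3)) \
    + 2.0 * np.einsum("abc,cik->abik", d, lam)
```

In both representations the sum of squares is a multiple of the unit matrix, as Schur's lemma requires for a Casimir operator.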

In the case of groups whose elements are themselves matrices, such as 
SO(3), SO(4), SU(2), SU(3), and the Lorentz group, one particular represen- 
tation of the generators may always be obtained by considering the general 
form of a matrix in the group which is infinitesimally close to the unit element. 
In a suitable parametrization, we may write such a matrix as 


141) eX, (M.68) 
v=1 
where (€1,€2,...,€r) are infinitesimal parameters, and (XO, x), ae axe?) 


are matrices representing the generators of the (matrix) group G. This is 
exactly the same procedure we followed for SU(2) in section 12.1.1, where 
we found from (12.26) that the three $X_\nu^{(SU(2))}$ were just $\tau_\nu/2$, satisfying the
SU(2) algebra. Similarly, in section 12.2 we saw that the eight SU(3) $X_\nu^{(SU(3))}$
were just $\lambda_\nu/2$, satisfying the SU(3) algebra. These particular two representations
are called the fundamental representations of the SU(2) and SU(3)
algebras, respectively; they are the representations of lowest dimensionality.
For SO(3), the three $X_\nu^{(SO(3))}$ are (from (M.17))


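    $$ X_1^{(SO(3))} = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -i \\ 0 & i & 0 \end{pmatrix}, \quad
       X_2^{(SO(3))} = \begin{pmatrix} 0 & 0 & i \\ 0 & 0 & 0 \\ -i & 0 & 0 \end{pmatrix}, \quad
       X_3^{(SO(3))} = \begin{pmatrix} 0 & -i & 0 \\ i & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, $$    (M.69)
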

which are the same as the 3 x 3 matrices $T_i^{(1)}$ of (12.48):

    $$ (T_i^{(1)})_{jk} = -i\epsilon_{ijk}. $$    (M.70)


The matrices $\tau_i/2$ and $T_i^{(1)}$ correspond to the values J = 1/2, J = 1, respectively,
in angular momentum terms.

It is not a coincidence that the coefficients on the right-hand side of (M.70) 
are (minus) the SO(3) structure constants. One representation of a Lie algebra 
is always provided by a set of matrices $\{X_\lambda^{(R)}\}$ whose elements are defined by

    $$ (X_\lambda^{(R)})_{\mu\nu} = -c_{\lambda\mu}^{\nu}, $$    (M.71)



where the c’s are the structure constants of (M.12), and each of $\mu, \nu, \lambda$ runs
from 1 to r. Thus these matrices are of dimensionality r x r, where r is the 
number of generators. That this prescription works is due to the fact that the 
generators satisfy the Jacobi identity 


    $$ [[X_\lambda, X_\mu], X_\nu] + [[X_\mu, X_\nu], X_\lambda] + [[X_\nu, X_\lambda], X_\mu] = 0. $$    (M.72)


Using (M.12) to evaluate the commutators, and the fact that the generators 
are independent, we obtain 


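    $$ c_{\lambda\mu}^{\alpha} c_{\alpha\nu}^{\beta} + c_{\mu\nu}^{\alpha} c_{\alpha\lambda}^{\beta} + c_{\nu\lambda}^{\alpha} c_{\alpha\mu}^{\beta} = 0. $$    (M.73)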


The reader may fill in the steps leading from here to the desired result: 


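    $$ (X_\lambda^{(R)})_{\nu\alpha} (X_\mu^{(R)})_{\alpha\beta} - (X_\mu^{(R)})_{\nu\alpha} (X_\lambda^{(R)})_{\alpha\beta} = c_{\lambda\mu}^{\alpha} (X_\alpha^{(R)})_{\nu\beta}. $$    (M.74)


(M.74) is precisely the $(\nu\beta)$ matrix element of


    $$ [X_\lambda^{(R)}, X_\mu^{(R)}] = c_{\lambda\mu}^{\alpha} X_\alpha^{(R)}, $$    (M.75)

showing that the $X_\mu^{(R)}$ satisfy the group algebra (M.12), as required. The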
representation in which the generators are represented by (minus) the struc- 
ture constants, in the sense of (M.71), is called the regular or adjoint repre- 
sentation. 
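For SO(3) the prescription (M.71) can be checked directly. The sketch below assumes the SO(3) structure constants are $c_{\lambda\mu}^{\nu} = i\epsilon_{\lambda\mu\nu}$, as implied by (M.69) and (M.70), and uses indices 0, 1, 2 in place of 1, 2, 3; the function names are illustrative only.

```python
def levi_civita(i, j, k):
    """Antisymmetric symbol for indices drawn from {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

def adjoint_generators():
    """Build c[l][m][n] = c^n_{lm} and X[l][m][n] = (X_l)_{mn} = -c^n_{lm}."""
    c = [[[1j * levi_civita(l, m, n) for n in range(3)]
          for m in range(3)] for l in range(3)]
    x = [[[-c[l][m][n] for n in range(3)]
          for m in range(3)] for l in range(3)]
    return c, x

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def algebra_residual(c, x):
    """Largest entry of [X_l, X_m] - c^n_{lm} X_n over all l, m."""
    worst = 0.0
    for l in range(3):
        for m in range(3):
            ab = matmul(x[l], x[m])
            ba = matmul(x[m], x[l])
            for i in range(3):
                for j in range(3):
                    rhs = sum(c[l][m][n] * x[n][i][j] for n in range(3))
                    worst = max(worst, abs(ab[i][j] - ba[i][j] - rhs))
    return worst

if __name__ == "__main__":
    c, x = adjoint_generators()
    print(x[0])                    # should reproduce the first matrix of (M.69)
    print(algebra_residual(c, x))  # should vanish
```

For SO(3) the adjoint representation built this way coincides with the defining 3 x 3 matrices, which is why (M.70) contains (minus) the structure constants.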

Having obtained any particular matrix representation $X^{(P)}$ of the generators
of a group G, a corresponding matrix representation of the group elements
can be obtained by exponentiation, via


    $$ D^{(P)}(\alpha) = \exp\{i\alpha\cdot X^{(P)}\}, $$    (M.76)


where $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_r)$ (see (12.31) and (12.49) for SU(2), and (12.74)
and (12.81) for SU(3)). In the case of the groups whose elements are matrices,
exponentiating the generators $X^{(G)}$ just recreates the general matrices of the
group, so we may call this the ‘self-representation’: the one in which the group
elements are represented by themselves. In the more general case (M.76), the
crucial property of the matrices $D^{(P)}(\alpha)$ is that they obey the same group
combination law as the elements of the group G they are representing: that
is, if the group elements obey


    $$ g(\alpha) g(\beta) = g(\gamma(\alpha, \beta)), $$    (M.77)


then

    $$ D^{(P)}(\alpha) D^{(P)}(\beta) = D^{(P)}(\gamma(\alpha, \beta)). $$    (M.78)

It is a rather remarkable fact that there are certain, say, 10 x 10 matrices 
which multiply together in exactly the same way as the rotation matrices of 


SO(3). 


M.6 The Lorentz group 


Consideration of matrix representations of the Lorentz group provides insight 
into the equations of relativistic quantum mechanics, for example the Dirac 
equation. Consider the infinitesimal Lorentz transformation (M.46). The 4x4 
matrix corresponding to this may be written in the form 


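    $$ 1 + i\epsilon\cdot X^{(LG)} - i\eta\cdot K^{(LG)}, $$    (M.79)

where

    $$ X_1^{(LG)} = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & -i \\ 0 & 0 & i & 0 \end{pmatrix} \quad \text{etc.} $$    (M.80)


(as in (M.69) but with an extra border of 0’s), and


    $$ K_1^{(LG)} = \begin{pmatrix} 0 & -i & 0 & 0 \\ -i & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}, \quad
       K_2^{(LG)} = \begin{pmatrix} 0 & 0 & -i & 0 \\ 0 & 0 & 0 & 0 \\ -i & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}, \quad
       K_3^{(LG)} = \begin{pmatrix} 0 & 0 & 0 & -i \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ -i & 0 & 0 & 0 \end{pmatrix}. $$    (M.81)


In (M.80) and (M.81) the matrices are understood to be acting on the
four-component vector


    $$ \begin{pmatrix} x^0 \\ x^1 \\ x^2 \\ x^3 \end{pmatrix}. $$    (M.82)


It is straightforward to check that the matrices $X_i^{(LG)}$ and $K_i^{(LG)}$ satisfy the
algebra (M.50)-(M.52) as expected.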

An important point to note is that the matrices $K_i^{(LG)}$, in contrast to
$X_i^{(SO(3))}$ or $X_i^{(LG)}$, and to the corresponding matrices of SU(2) and SU(3),
are not Hermitian. A theorem states that only the generators of compact
Lie groups can be represented by finite-dimensional Hermitian matrices. Here
‘compact’ means that the domain of variation of all the parameters is bounded
(none exceeds a given positive number p in absolute magnitude) and closed
(the limit of every convergent sequence of points in the set also lies in the set).
For the Lorentz group, the limiting velocity c is not included (the $\gamma$-factor goes
to infinity), and so the group is non-compact.

In a general representation of the Lorentz group, the generators $X_i$, $K_i$
will obey the algebra (M.50)-(M.52). Let us introduce the combinations


    $$ P \equiv \tfrac{1}{2}(X + iK) $$    (M.83)
    $$ Q \equiv \tfrac{1}{2}(X - iK). $$    (M.84)

Then the algebra becomes

    $$ [P_i, P_j] = i\epsilon_{ijk} P_k $$    (M.85)
    $$ [Q_i, Q_j] = i\epsilon_{ijk} Q_k $$    (M.86)
    $$ [P_i, Q_j] = 0, $$    (M.87)


which are apparently the same as (M.43)-(M.45). We can see from (M.81)
that the matrices $iK_i^{(LG)}$ are Hermitian, and the same is in fact true in a
general finite-dimensional representation. So we can appropriate standard
angular momentum theory to set up the representations of the algebra of
the P’s and Q’s: they behave just like two independent (mutually
commuting) angular momenta. The eigenvalues of $P^2$ are of the form P(P+1),
for P = 0, 1/2, ..., and similarly for $Q^2$; the eigenvalues of $P_3$ are $M_P$ where
$-P \le M_P \le P$, and similarly for $Q_3$.
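The decoupling expressed by (M.85)-(M.87) can be confirmed numerically in the four-vector representation. This is a sketch under stated assumptions: the rotation generators are taken as the matrices of (M.69) bordered by zeros, and the boost generators are assumed to have the entry $-i$ in the $(0, i)$ and $(i, 0)$ positions (the overall sign of K does not affect the algebra). All function names are illustrative.

```python
def levi_civita(i, j, k):
    """Antisymmetric symbol for indices drawn from {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

def zeros(n):
    return [[0j] * n for _ in range(n)]

def rotation_generators():
    """4 x 4 matrices X_i: (X_i)_{jk} = -i eps_{ijk} in the spatial block."""
    gens = []
    for i in range(3):
        m = zeros(4)
        for j in range(3):
            for k in range(3):
                m[j + 1][k + 1] = -1j * levi_civita(i, j, k)
        gens.append(m)
    return gens

def boost_generators():
    """4 x 4 matrices K_i mixing the time and i-th space components (assumed sign)."""
    gens = []
    for i in range(3):
        m = zeros(4)
        m[0][i + 1] = -1j
        m[i + 1][0] = -1j
        gens.append(m)
    return gens

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(4)] for i in range(4)]

def combine(a, ca, b, cb):
    return [[ca * a[i][j] + cb * b[i][j] for j in range(4)] for i in range(4)]

def pq_algebra_residual():
    """Largest violation of (M.85)-(M.87) for P = (X + iK)/2, Q = (X - iK)/2."""
    x, k = rotation_generators(), boost_generators()
    p = [combine(x[i], 0.5, k[i], 0.5j) for i in range(3)]
    q = [combine(x[i], 0.5, k[i], -0.5j) for i in range(3)]
    worst = 0.0
    for i in range(3):
        for j in range(3):
            pp, qq, pq = commutator(p[i], p[j]), commutator(q[i], q[j]), commutator(p[i], q[j])
            for r in range(4):
                for s in range(4):
                    rhs_p = sum(1j * levi_civita(i, j, n) * p[n][r][s] for n in range(3))
                    rhs_q = sum(1j * levi_civita(i, j, n) * q[n][r][s] for n in range(3))
                    worst = max(worst, abs(pp[r][s] - rhs_p),
                                abs(qq[r][s] - rhs_q), abs(pq[r][s]))
    return worst

if __name__ == "__main__":
    print(pq_algebra_residual())  # should vanish
```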

Consider the particular case where the eigenvalue of $Q^2$ is zero (Q = 0),
and the value of P is 1/2. The first condition implies that the Q’s are
identically zero, so that

    $$ X = iK $$    (M.88)


in this representation, while the second condition tells us that

    $$ P \equiv \tfrac{1}{2}(X + iK) = \tfrac{1}{2}\sigma, $$    (M.89)


the familiar matrices for spin-1/2. We label this representation by the values
of P (1/2) and Q (0) (these are the eigenvalues of the two Casimir operators).
Then using (M.88) and (M.89) we find


    $$ X^{(1/2,0)} = \tfrac{1}{2}\sigma $$    (M.90)


and

    $$ K^{(1/2,0)} = -\tfrac{i}{2}\sigma. $$    (M.91)

Now recall that the general infinitesimal Lorentz transformation has the 
form 


    $$ 1 + i\epsilon\cdot X - i\eta\cdot K. $$    (M.92)

In the present case this becomes

    $$ 1 + i\epsilon\cdot\sigma/2 - \eta\cdot\sigma/2. $$    (M.93)


These matrices are of dimension 2 x 2, and act on two-component spinors,
which therefore transform under an infinitesimal Lorentz transformation by
(cf (4.19) and (4.42))


    $$ \phi' = (1 + i\epsilon\cdot\sigma/2 - \eta\cdot\sigma/2)\phi. $$    (M.94)


We say that $\phi$ ‘transforms as the (1/2, 0) representation of the Lorentz group’.
The ‘$1 + i\epsilon\cdot\sigma/2$’ part is the familiar (infinitesimal) rotation matrix for spinors,
first met in section 4.4; it exponentiates to give $\exp(i\alpha\cdot\sigma/2)$ for finite rotations.
The ‘$-\eta\cdot\sigma/2$’ part shows how such a spinor transforms under a pure
(infinitesimal) velocity transformation. The finite transformation law is


    $$ \phi' = \exp(-\vartheta\cdot\sigma/2)\phi $$    (M.95)


where the three real parameters $\vartheta = (\vartheta_1, \vartheta_2, \vartheta_3)$ specify the direction and
magnitude of the boost.
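As a short worked step, the exponential in (M.95) can be summed in closed form. This sketch uses only the identity $(\hat{\vartheta}\cdot\sigma)^2 = 1$ for a unit vector; the symbols $\hat{\vartheta}$ (the boost direction) and $|\vartheta|$ (its magnitude) are introduced here for illustration.

```latex
\exp(-\vartheta\cdot\sigma/2)
  = \sum_{k=0}^{\infty} \frac{(-|\vartheta|/2)^k}{k!}\,(\hat{\vartheta}\cdot\sigma)^k
  = \cosh(|\vartheta|/2) - \hat{\vartheta}\cdot\sigma\,\sinh(|\vartheta|/2),
\qquad \hat{\vartheta} = \vartheta/|\vartheta| .
```

The even powers of $\hat{\vartheta}\cdot\sigma$ give the unit matrix and the odd powers give $\hat{\vartheta}\cdot\sigma$, so the series splits into the cosh and sinh parts. Unlike the rotation matrix, the result is not unitary, consistent with the non-Hermitian K of (M.91).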

There is, however, a second two-dimensional representation, which is
characterized by the labelling P = 0, Q = 1/2, which we denote by (0, 1/2). In
this case, the previous steps yield


    $$ X^{(0,1/2)} = \tfrac{1}{2}\sigma $$    (M.96)

as before, but

    $$ K^{(0,1/2)} = \tfrac{i}{2}\sigma. $$    (M.97)


So the corresponding two-component spinor $\chi$ transforms by (cf (4.19) and
(4.42))

    $$ \chi' = (1 + i\epsilon\cdot\sigma/2 + \eta\cdot\sigma/2)\chi. $$    (M.98)

We see that $\phi$ and $\chi$ behave the same under rotations, but ‘oppositely’ under
boosts.
These transformation laws are exactly what we used in section 4.1.2 when
discussing the behaviour of the Dirac wavefunction $\psi$ under Lorentz
transformations, where $\psi$ is put together from one $\phi$ and one $\chi$ via


    $$ \psi = \begin{pmatrix} \phi \\ \chi \end{pmatrix} $$    (M.99)


and describes a massive spin-1/2 particle according to the equations


    $$ E\phi = \sigma\cdot p\,\phi + m\chi $$
    $$ E\chi = -\sigma\cdot p\,\chi + m\phi, $$    (M.100)


consistent with the representation (3.40) of the Dirac matrices. 
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M.7 The relation between SU(2) and SO(3) 


We have seen (sections M.4.1 and M.4.2) that the algebras of these two groups 
are identical. So the groups are isomorphic in the vicinity of their respective 
identity elements. Furthermore, matrix representations of one algebra auto- 
matically provide representations of the other. Since exponentiating these 
infinitesimal matrix transformations produces matrices representing group 
elements corresponding to finite transformations in both cases, it might ap- 
pear that the groups are fully isomorphic. But actually they are not, as we 
shall now discuss. 

We begin by re-considering the parameters used to characterize elements 
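of SO(3) and SU(2). A general 3-D rotation is described by the SO(3) matrix
$R(\hat{n}, \theta)$, where $\hat{n}$ is the axis of the rotation and $\theta$ is the angle of rotation. For
example,

    $$ R(\hat{z}, \theta) = \begin{pmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}. $$    (M.101)
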


On the other hand, we can write the general SU(2) matrix V in the form

    $$ V = \begin{pmatrix} a & b \\ -b^* & a^* \end{pmatrix} $$    (M.102)


where $|a|^2 + |b|^2 = 1$ from the unit determinant condition. It therefore depends
on three real parameters, the choice of which we are now going to examine
in more detail than previously. In (12.32) we wrote V as $\exp(i\alpha\cdot\tau/2)$,
which certainly involves three real parameters $\alpha_1, \alpha_2, \alpha_3$; and below (12.35)
we proposed, further, to write $\alpha = \hat{n}\theta$, where $\theta$ is an angle and $\hat{n}$ is a unit
vector. Then, since (as the reader may verify)


    $$ \exp(i\theta\,\tau\cdot\hat{n}/2) = \cos\theta/2 + i\tau\cdot\hat{n}\,\sin\theta/2, $$    (M.103)

it follows that this latter parametrization corresponds to writing, in (M.102),

    $$ a = \cos\theta/2 + i n_3 \sin\theta/2, \qquad b = (n_2 + i n_1)\sin\theta/2, $$    (M.104)


with $n_1^2 + n_2^2 + n_3^2 = 1$. Clearly the condition $|a|^2 + |b|^2 = 1$ is satisfied, and
one can convince oneself that the full range of a and b is covered if $\theta/2$ lies
between 0 and $\pi$ (in particular, it is not necessary to extend the range of $\theta/2$
so as to include the interval $\pi$ to $2\pi$, since the corresponding region of a, b can
be covered by changing the orientation of $\hat{n}$, which has not been constrained
in any way). It follows that the parameters $\alpha$ satisfy $\alpha^2 \le 4\pi^2$; that is, the
space of the $\alpha$’s is the interior, and surface, of a sphere of radius $2\pi$, as shown
in figure M.1.

What about the parameter space of SO(3)? In this case, the same parameters
$\hat{n}$ and $\theta$ specify a rotation, but now $\theta$ (rather than $\theta/2$) runs from 0 to $\pi$.
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FIGURE M.1 
The parameter spaces of SO(3) and SU(2): the whole sphere is the parameter 
space of SU(2), the upper (stippled) hemisphere that of SO(3). 


However, we may allow the range of $\theta$ to extend to $2\pi$, by taking advantage
of the fact that

    $$ R(\hat{n}, \pi + \theta) = R(-\hat{n}, \pi - \theta). $$    (M.105)


Thus if we agree to limit $\hat{n}$ to directions in the upper hemisphere of figure M.1,
for 3-D rotations, we can say that the whole sphere represents the parameter
space of SU(2), but that of SO(3) is provided by the upper half only.

Now let us consider the correspondence, or mapping, between the matrices
of SO(3) and SU(2): we want to see if it is one-to-one. The notation
strongly suggests that the matrix $V(\hat{n}, \theta) = \exp(i\theta\hat{n}\cdot\tau/2)$ of SU(2)
corresponds to the matrix $R(\hat{n}, \theta)$ of SO(3), but the way it actually works has a
subtlety.

We form the quantity $x\cdot\tau$, and assert that


    $$ x'\cdot\tau = V(\hat{n}, \theta)\; x\cdot\tau\; V^\dagger(\hat{n}, \theta), $$    (M.106)


where $x' = R(\hat{n}, \theta)x$. We can easily verify (M.106) for the special case $R(\hat{z}, \theta)$,
using (M.101); the general case follows with more labour (but the general
infinitesimal case should by now be a familiar manipulation). (M.106)
establishes a precise mapping between the elements of SU(2) and those of SO(3),
but it is not one-to-one (i.e. not an isomorphism), since plainly V can always
be replaced by $-V$ and $x'$ will be unchanged, and hence so will the associated
SO(3) matrix $R(\hat{n}, \theta)$. It is therefore a homomorphism.
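The special case $R(\hat{z}, \theta)$ of (M.106), and the two-to-one nature of the mapping, can be checked numerically. The following is a minimal sketch assuming the forms (M.101) and (M.103) and the standard Pauli matrices; the helper names are illustrative.

```python
import math

# Standard Pauli matrices tau_1, tau_2, tau_3.
TAU = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(m):
    return [[complex(m[j][i]).conjugate() for j in range(2)] for i in range(2)]

def dot_tau(v):
    """The 2 x 2 matrix v . tau for a 3-vector v."""
    return [[sum(v[a] * TAU[a][i][j] for a in range(3)) for j in range(2)]
            for i in range(2)]

def su2_matrix(n, theta):
    """V(n, theta) = cos(theta/2) + i tau.n sin(theta/2), as in (M.103)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    nt = dot_tau(n)
    return [[c * (i == j) + 1j * s * nt[i][j] for j in range(2)]
            for i in range(2)]

def rotate_z(theta, x):
    """Apply the matrix R(z, theta) of (M.101) to the 3-vector x."""
    c, s = math.cos(theta), math.sin(theta)
    return [c * x[0] + s * x[1], -s * x[0] + c * x[1], x[2]]

def mapping_residual(x, theta, sign=1):
    """Largest entry of x'.tau - V x.tau V^dagger, with V replaced by sign * V."""
    v = [[sign * e for e in row] for row in su2_matrix((0, 0, 1), theta)]
    lhs = dot_tau(rotate_z(theta, x))
    rhs = matmul(matmul(v, dot_tau(x)), dagger(v))
    return max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))

if __name__ == "__main__":
    x = [0.3, -1.2, 0.7]
    print(mapping_residual(x, 0.9))           # V reproduces R(z, theta)
    print(mapping_residual(x, 0.9, sign=-1))  # so does -V
    print(su2_matrix((0, 0, 1), 2 * math.pi)) # close to minus the unit matrix
```

Both V and $-V$ give the same rotated vector, and a rotation angle of $2\pi$ yields minus the unit matrix in SU(2) while the SO(3) matrix has returned to the identity.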

Next, we prove a little theorem to the effect that the identity element e 
of a group G must be represented by the unit matrix of the representation: 
D(e) = I. For, let D(a), D(e) represent the elements a, e of G. Then D(ae) = 
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D(a)D(e) by the fundamental property (M.78) of representation matrices. On 
the other hand, ae = a by the property of e. So we have D(a) = D(a)D(e), 
and hence D(e) = I. 

Now let us return to the correspondence between SU(2) and SO(3). $V(\hat{n}, \theta)$
corresponds to $R(\hat{n}, \theta)$, but can an SU(2) matrix be said to provide a valid
representation of SO(3)? Consider the case $V(\hat{n} = \hat{z}, \theta = 2\pi)$. From (M.103)
this is equal to

    $$ \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix} $$    (M.107)


but the corresponding rotation matrix, from (M.101), is the identity matrix.
Hence our theorem is violated, since (M.107) is plainly not the identity matrix
of SU(2). Thus the SU(2) matrices cannot be said to represent rotations, in
the strict sense. Nevertheless, spin-1/2 particles certainly do exist, so Nature
appears to make use of these ‘not quite’ representations! The SU(2) identity
element is $V(\hat{n} = \hat{z}, \theta = 4\pi)$, confirming that the rotational properties of a
spinor are quite other than those of a classical object.
In fact, two and only two distinct elements of SU(2), namely


    $$ \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}, $$    (M.108)


correspond to the identity element of SO(3) in the correspondence (M.106),
just as, in general, V and $-V$ correspond to the same SO(3) element $R(\hat{n}, \theta)$,
as we saw. The failure to be a true representation is localized simply to a sign:
we may indeed say that, up to a sign, SU(2) matrices provide a representation
of SO(3). If we ‘factor out’ this sign, the groups are isomorphic. A more
mathematically precise way of saying this is given in Jones (1990, chapter 8).


N 


Geometrical Aspects of Gauge Fields 


N.1 Covariant derivatives and coordinate transformations


Let us go back to the U(1) case, equations (13.4)-(13.7). There, the
introduction of the (gauge) covariant derivative $D^\mu$ produced an object, $D^\mu\psi(x)$,
which transformed like $\psi(x)$ under local U(1) phase transformations, unlike
the ordinary derivative $\partial^\mu\psi(x)$ which acquired an ‘extra’ piece when
transformed. This followed from simple calculus, of course, but there is a slightly
different way of thinking about it. The derivative involves not only $\psi(x)$ at
the point x, but also $\psi$ at the infinitesimally close, but different, point $x + dx$;
and the transformation law of $\psi(x)$ involves $\alpha(x)$, while that of $\psi(x + dx)$
would involve the different function $\alpha(x + dx)$. Thus we may perhaps expect
something to ‘go wrong’ with the transformation law for the gradient.

To bring out the geometrical analogy we are seeking, let us write $\psi = \psi_R + i\psi_I$
and $\alpha(x) = q\chi(x)$ so that (13.3) becomes (cf (2.64))


    $$ \psi'_R(x) = \cos\alpha(x)\,\psi_R(x) - \sin\alpha(x)\,\psi_I(x) $$
    $$ \psi'_I(x) = \sin\alpha(x)\,\psi_R(x) + \cos\alpha(x)\,\psi_I(x). $$    (N.1)


If we think of $\psi_R(x)$ and $\psi_I(x)$ as being the components of a ‘vector’ $\vec{\psi}(x)$ along
the $\vec{e}_R$ and $\vec{e}_I$ axes, respectively, then (N.1) would represent the components
of $\vec{\psi}(x)$ as referred to new axes $\vec{e}\,'_R$ and $\vec{e}\,'_I$, which have been rotated by $-\alpha(x)$
about an axis in the direction $\vec{e}_R \times \vec{e}_I$ (i.e. normal to the $\vec{e}_R$-$\vec{e}_I$ plane), as shown
in figure N.1. Other such ‘vectors’ $\vec{\phi}_1(x), \vec{\phi}_2(x), \ldots$ (i.e. other wavefunctions
for particles of the same charge q) when evaluated at the same point x will
have ‘components’ transforming the same as (N.1) under the axis rotation
$\vec{e}_R, \vec{e}_I \to \vec{e}\,'_R, \vec{e}\,'_I$. But the components of the vector $\vec{\psi}(x + dx)$ will behave
differently. The transformation law (N.1) when written at $x + dx$ will involve
$\alpha(x + dx)$, which (to first order in dx) is $\alpha(x) + \partial_\mu\alpha(x)dx^\mu$. Thus for $\psi_R(x + dx)$
and $\psi_I(x + dx)$ the rotation angle is $\alpha(x) + \partial_\mu\alpha(x)dx^\mu$ rather than $\alpha(x)$.
Now comes the key step in the analogy: we may think of the additional
angle $\partial_\mu\alpha(x)dx^\mu$ as coming about because, in going from x to $x + dx$, the
coordinate basis vectors $\vec{e}_R$ and $\vec{e}_I$ have been rotated through $+\partial_\mu\alpha(x)dx^\mu$
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FIGURE N.1 
Geometrical analogy for a U(1) gauge transformation. 


(see figure N.2)! But that would mean that our ‘naive’ approach to rotations
of the derivative of $\psi(x)$ amounts to using one set of axes at x, and another
at $x + dx$, which is likely to lead to ‘trouble’. Consider now an elementary
example (from Schutz 1988, chapter 5) where just this kind of problem arises,
namely the use of polar coordinate basis vectors $\vec{e}_r$ and $\vec{e}_\theta$, which point in the
r and $\theta$ directions respectively. We have, as usual,


    $$ x = r\cos\theta, \qquad y = r\sin\theta $$    (N.2)

and in a (real!) Cartesian basis $d\vec{r}$ is given by

    $$ d\vec{r} = dx\,\hat{i} + dy\,\hat{j}. $$    (N.3)

Using (N.2) in (N.3) we find

    $$ d\vec{r} = (dr\cos\theta - r\sin\theta\,d\theta)\,\hat{i} + (dr\sin\theta + r\cos\theta\,d\theta)\,\hat{j}
              = dr\,\vec{e}_r + d\theta\,\vec{e}_\theta $$    (N.4)

where

    $$ \vec{e}_r = \cos\theta\,\hat{i} + \sin\theta\,\hat{j}, \qquad \vec{e}_\theta = -r\sin\theta\,\hat{i} + r\cos\theta\,\hat{j}. $$    (N.5)


Plainly, $\vec{e}_r$ and $\vec{e}_\theta$ change direction (and even magnitude, for $\vec{e}_\theta$) as we move
about in the x-y plane, as shown in figure N.2. So at each point $(r, \theta)$ we
have different axes $\vec{e}_r$, $\vec{e}_\theta$.
Now suppose that we wish to describe a vector field $\vec{V}$ in terms of $\vec{e}_r$ and
$\vec{e}_\theta$ via

    $$ \vec{V} = V^r\vec{e}_r + V^\theta\vec{e}_\theta \equiv V^\alpha\vec{e}_\alpha \quad (\text{sum on } \alpha = r, \theta), $$    (N.6)
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FIGURE N.2 
Changes in the basis vectors €, and €g of polar coordinates. 


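and that we are also interested in the derivatives of $\vec{V}$, in this basis. Let us
calculate $\partial\vec{V}/\partial r$, for example, by brute force:

    $$ \frac{\partial\vec{V}}{\partial r} = \frac{\partial V^r}{\partial r}\vec{e}_r + \frac{\partial V^\theta}{\partial r}\vec{e}_\theta
       + V^r\frac{\partial\vec{e}_r}{\partial r} + V^\theta\frac{\partial\vec{e}_\theta}{\partial r}, $$    (N.7)


where we have included the derivatives of $\vec{e}_r$ and $\vec{e}_\theta$ to allow for the fact that
these vectors are not constant. From (N.5) we easily find


    $$ \frac{\partial\vec{e}_r}{\partial r} = 0, \qquad
       \frac{\partial\vec{e}_\theta}{\partial r} = -\sin\theta\,\hat{i} + \cos\theta\,\hat{j} = \frac{1}{r}\vec{e}_\theta, $$    (N.8)


which allows the last two terms in (N.7) to be evaluated. Similarly, we can
calculate $\partial\vec{V}/\partial\theta$. In general, we may write these results as


    $$ \frac{\partial\vec{V}}{\partial q^\beta} = \frac{\partial V^\alpha}{\partial q^\beta}\vec{e}_\alpha + V^\alpha\frac{\partial\vec{e}_\alpha}{\partial q^\beta}, $$    (N.9)

where $\beta = 1, 2$ with $q^1 = r$, $q^2 = \theta$, and $\alpha = r, \theta$.

In the present case, we were able to calculate $\partial\vec{e}_\alpha/\partial q^\beta$ explicitly from
(N.5), as in (N.8). But whatever the nature of the coordinate system, $\partial\vec{e}_\alpha/\partial q^\beta$
is some vector and must be expressible as a linear combination of the basis
vectors via an expression of the form


    $$ \frac{\partial\vec{e}_\alpha}{\partial q^\beta} = \Gamma^\gamma_{\alpha\beta}\,\vec{e}_\gamma, $$    (N.10)


where the repeated index $\gamma$ is summed over as usual ($\gamma = r, \theta$). Inserting
(N.10) into (N.9) and interchanging the ‘dummy’ (i.e. summed over) indices
$\alpha$ and $\gamma$ gives finally


    $$ \frac{\partial\vec{V}}{\partial q^\beta} = \left(\frac{\partial V^\alpha}{\partial q^\beta} + \Gamma^\alpha_{\gamma\beta}V^\gamma\right)\vec{e}_\alpha. $$    (N.11)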


This is a very important result: it shows that, whereas the components of $\vec{V}$ in
the basis $\vec{e}_\alpha$ are just $V^\alpha$, the components of the derivative of $\vec{V}$ are not simply
$\partial V^\alpha/\partial q^\beta$, but contain an additional term: the ‘components of the derivative
of a vector’ are not just the ‘derivatives of the components of the vector’.
Let us abbreviate $\partial/\partial q^\beta$ to $\partial_\beta$; then (N.11) tells us that in the $\vec{e}_\alpha$ basis,
as used in (N.11), the components of the $\partial_\beta$ derivative of $\vec{V}$ are

    $$ \partial_\beta V^\alpha + \Gamma^\alpha_{\gamma\beta}V^\gamma \equiv D_\beta V^\alpha. $$    (N.12)


The expression (N.12) is called the ‘covariant derivative’ of $V^\alpha$ within the
context of the mathematics of general coordinate systems: it is denoted (as
in (N.12)) by $D_\beta V^\alpha$ or, often, by $V^\alpha_{\;;\beta}$ (in the latter notation, $\partial_\beta V^\alpha$ is $V^\alpha_{\;,\beta}$).
The most important property of $D_\beta V^\alpha$ is its transformation character under
general coordinate transformations. Crucially, it transforms as a tensor $T^\alpha_{\;\beta}$
(see appendix D of volume 1) with the indicated ‘one up, one down’ indices;
we shall not prove this here, referring instead to Schutz (1988), for example.
This property is the reason for the name ‘covariant derivative’, meaning in
this case essentially that it transforms the way its indices would have you
believe it should. By contrast, and despite appearances, $\partial_\beta V^\alpha$ by itself does
not transform as a ‘$T^\alpha_{\;\beta}$’ tensor, and in a similar way $\Gamma^\alpha_{\gamma\beta}$ is not a $T^\alpha_{\;\gamma\beta}$-type
tensor; only the combined object $D_\beta V^\alpha$ is a $T^\alpha_{\;\beta}$.
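The connection coefficients of (N.10) for the polar basis can be extracted numerically. The sketch below assumes the basis vectors of (N.5), differentiates them by central finite differences, and expands the result in the local basis; the storage convention `gamma[g][a][b]` for $\Gamma^\gamma_{\alpha\beta}$ (with index 0 for r and 1 for $\theta$) and all function names are illustrative choices.

```python
import math

def basis(r, theta):
    """Cartesian components of (e_r, e_theta) from (N.5)."""
    e_r = (math.cos(theta), math.sin(theta))
    e_theta = (-r * math.sin(theta), r * math.cos(theta))
    return [e_r, e_theta]

def components(vec, r, theta):
    """Expand a Cartesian 2-vector as c_r e_r + c_theta e_theta (Cramer's rule)."""
    e_r, e_t = basis(r, theta)
    det = e_r[0] * e_t[1] - e_r[1] * e_t[0]  # equals r
    c_r = (vec[0] * e_t[1] - vec[1] * e_t[0]) / det
    c_t = (e_r[0] * vec[1] - e_r[1] * vec[0]) / det
    return [c_r, c_t]

def connection(r, theta, h=1e-6):
    """gamma[g][a][b] = Gamma^g_{ab}, defined by d e_a / d q^b = Gamma^g_{ab} e_g."""
    q = [r, theta]
    gamma = [[[0.0, 0.0] for _ in range(2)] for _ in range(2)]
    for b in range(2):
        q_plus, q_minus = list(q), list(q)
        q_plus[b] += h
        q_minus[b] -= h
        e_plus, e_minus = basis(*q_plus), basis(*q_minus)
        for a in range(2):
            deriv = [(e_plus[a][i] - e_minus[a][i]) / (2 * h) for i in range(2)]
            comp = components(deriv, r, theta)
            for g in range(2):
                gamma[g][a][b] = comp[g]
    return gamma

if __name__ == "__main__":
    g = connection(2.0, 0.7)
    print("Gamma^theta_{r theta}     =", g[1][0][1])  # expect about 1/r
    print("Gamma^theta_{theta r}     =", g[1][1][0])  # expect about 1/r
    print("Gamma^r_{theta theta}     =", g[0][1][1])  # expect about -r
```

The only non-zero coefficients should be $\Gamma^\theta_{r\theta} = \Gamma^\theta_{\theta r} = 1/r$ and $\Gamma^r_{\theta\theta} = -r$, in agreement with (N.8) and the analogous derivatives with respect to $\theta$.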

This circumstance is highly reminiscent of the situation we found in the
case of gauge transformations. Consider the simplest case, that of U(1), for
which $D_\mu\psi = \partial_\mu\psi + iqA_\mu\psi$. The quantity $D_\mu\psi$ transforms under a gauge
transformation in the same way as $\psi$ itself, but $\partial_\mu\psi$ does not. There is thus
a close analogy between the ‘good’ transformation properties of $D_\beta V^\alpha$ and of
$D_\mu\psi$. Further, the structure of $D_\mu\psi$ is very similar to that of $D_\beta V^\alpha$. There
are two pieces, the first of which is the straightforward derivative, while the
second involves a new field ($\Gamma$ or A) and is also proportional to the original
field. The ‘i’ of course is a big difference, showing that in the gauge symmetry
case the transformations mix the real and imaginary parts of the wavefunction,
rather than actual spatial components of a vector.

Indeed, the analogy is even closer in the non-Abelian (e.g. local SU(2))
case. As we have seen, $\partial^\mu\psi^{(1/2)}$ does not transform as an SU(2) isospinor
because of the extra piece involving $\partial^\mu\epsilon$; nor do the gauge fields $W^\mu$ transform
as pure T = 1 states, also because of a $\partial^\mu\epsilon$ term. But the gauge covariant
combination $(\partial^\mu + ig\tau\cdot W^\mu/2)\psi^{(1/2)}$ does transform as an isospinor under local
SU(2) transformations, the two ‘extra’ $\partial^\mu\epsilon$ pieces cancelling each other out.

There is a useful way of thinking about the two contributions to $D_\beta V^\alpha$
(or $D^\mu\psi$). Let us multiply (N.12) by $dq^\beta$ and sum over $\beta$ so as to obtain


    $$ DV^\alpha \equiv D_\beta V^\alpha dq^\beta = \partial_\beta V^\alpha dq^\beta + \Gamma^\alpha_{\gamma\beta}V^\gamma dq^\beta. $$    (N.13)
FIGURE N.3
Parallel transport of a vector V in a polar coordinate basis. 


The first term on the right-hand side of (N.13) is $(\partial V^\alpha/\partial q^\beta)dq^\beta$, which is just the
conventional differential $dV^\alpha$, representing the change in $V^\alpha$ in moving from
$q^\beta$ to $q^\beta + dq^\beta$: $dV^\alpha = [V^\alpha(q^\beta + dq^\beta) - V^\alpha(q^\beta)]$. Again, despite appearances,
the quantities $dV^\alpha$ do not form the components of a vector, and the reason
is that $V^\alpha(q^\beta + dq^\beta)$ are components with respect to axes at $q^\beta + dq^\beta$, while
$V^\alpha(q^\beta)$ are components with respect to different axes at $q^\beta$. To form a ‘good’
differential $DV^\alpha$, transforming as a vector, we must subtract quantities
defined in the same coordinate system. This means that we need some way of
‘carrying’ $V^\alpha(q^\beta)$ to $q^\beta + dq^\beta$, while keeping it somehow ‘the same’ as it was
at $q^\beta$.

A reasonable definition of such a ‘preserved’ vector field is one that is
unchanged in length, and has the same orientation relative to the axes at
$q^\beta + dq^\beta$ as it had relative to the axes at $q^\beta$ (see figure N.3). In other words,
$\vec{V}$ is ‘dragged around’ with the changing coordinate frame, a process called
parallel transport. Such a definition of ‘no change’ of course implies that
change has occurred, in general, with respect to the original axes at $q^\beta$. Let
us denote by $\delta V^\alpha$ the difference between the components of $\vec{V}$ after parallel
transport to $q^\beta + dq^\beta$, and the components of $\vec{V}$ at $q^\beta$ (see figure N.3). Then a
reasonable definition of the ‘good’ differential of $V^\alpha$ would be
$V^\alpha(q^\beta + dq^\beta) - (V^\alpha(q^\beta) + \delta V^\alpha) = dV^\alpha - \delta V^\alpha$. We interpret this as the covariant differential
$DV^\alpha$ of (N.13), and accordingly make the identification


    $$ \delta V^\alpha = -\Gamma^\alpha_{\gamma\beta}V^\gamma dq^\beta. $$    (N.14)


On this interpretation, then, the coefficients $\Gamma^\alpha_{\gamma\beta}$ connect the components of
a vector at one point with its components at a nearby point, after the vector
has been carried by ‘parallel transport’ from one point to the other; they are
often called ‘connection coefficients’, or just ‘the connection’.
In an analogous way we can write, in the U(1) gauge case,


    $$ D\psi \equiv D^\mu\psi\,dx_\mu = \partial^\mu\psi\,dx_\mu + ieA^\mu\psi\,dx_\mu
             \equiv d\psi - \delta\psi $$    (N.15)

with

    $$ \delta\psi = -ieA^\mu\psi\,dx_\mu. $$    (N.16)


Equation (N.16) has a very similar structure to (N.14), suggesting that the
electromagnetic potential $A^\mu$ might well be referred to as a ‘gauge connection’,
as indeed it is in some quarters. Equations (N.15) and (N.16) generalize
straightforwardly for $D\psi^{(1/2)}$ and $\delta\psi^{(1/2)}$.

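We can relate (N.16) in a very satisfactory way to our original discussion of
electromagnetism as a gauge theory in chapter 2, and in particular to (2.83).
For transport restricted to the three spatial directions, (N.16) reduces to


    $$ \delta\psi(x) = ie\vec{A}\cdot d\vec{x}\,\psi(x). $$    (N.17)


However, the solution (2.83) gives


    $$ \psi(x) = \exp\left(ie\int_{-\infty}^{x}\vec{A}\cdot d\vec{\ell}\right)\psi(\vec{A} = 0, x), $$    (N.18)


replacing q by e. So


    $$ \psi(x + dx) = \exp\left(ie\int_{-\infty}^{x+dx}\vec{A}\cdot d\vec{\ell}\right)\psi(\vec{A} = 0, x + dx) $$
    $$ \qquad = \exp\left(ie\int_{x}^{x+dx}\vec{A}\cdot d\vec{\ell}\right)\exp\left(ie\int_{-\infty}^{x}\vec{A}\cdot d\vec{\ell}\right)\psi(\vec{A} = 0, x + dx) $$
    $$ \qquad \approx (1 + ie\vec{A}\cdot d\vec{x})\exp\left(ie\int_{-\infty}^{x}\vec{A}\cdot d\vec{\ell}\right)[\psi(\vec{A} = 0, x) + \nabla\psi(\vec{A} = 0, x)\cdot d\vec{x}] $$
    $$ \qquad \approx \psi(x) + ie\vec{A}\cdot d\vec{x}\,\psi(x) + \exp\left(ie\int_{-\infty}^{x}\vec{A}\cdot d\vec{\ell}\right)\nabla\psi(\vec{A} = 0, x)\cdot d\vec{x}, $$    (N.19)


to first order in $d\vec{x}$. On the right-hand side of (N.19) we see (i) the change $\delta\psi$
of (N.17), due to ‘parallel transport’ as prescribed by the gauge connection $\vec{A}$,
and (ii) the change in $\psi$ viewed as a function of x, in the absence of $\vec{A}$. The
solution (N.18) gives, in fact, the ‘integrated’ form of the small displacement
law (N.19).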

FIGURE N.4
Parallel transport (a) round a curved triangle on the surface of a sphere (b)
round a triangle in a flat plane.


At this point the reader might object, going back to the $\vec{e}_r$, $\vec{e}_\theta$ example,
that we had made a lot of fuss about nothing: after all, no one forced us
to use the $\vec{e}_r$, $\vec{e}_\theta$ basis, and if we had simply used the $\hat{i}$, $\hat{j}$ basis (which is
constant throughout the plane) we would have had no such ‘trouble’. This
is a fair point, provided that we somehow knew that we are really doing
physics in a ‘flat’ space, such as the Euclidean plane. But suppose instead that
our two-dimensional space was the surface of a sphere. Then, an intuitively
plausible definition of parallel transport is shown in figure N.4(a), in which
transport is carried out around a closed path consisting of three great circle
arcs A → B, B → C, C → A, with the rule that at each stage the vector
is drawn ‘as parallel as possible’ to the previous one. It is clear from the
figure that the vector we end up with at A, after this circuit, is no longer
parallel to the vector we started with; in fact, it has rotated by $\pi/2$ in this
example, in which 1/8th of the surface area of the unit sphere is enclosed by
the triangle ABC. By contrast, the parallel transport of a vector round a
flat triangle in the Euclidean plane leads to no such net change in the vector
(figure N.4(b)).
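The rotation by $\pi/2$ can be reproduced numerically. The sketch below assumes the triangle has vertices at the north pole and at two points on the equator a quarter-turn apart (an octant of the unit sphere), and uses the fact that parallel transport of a tangent vector along a great-circle arc is a rigid rotation about the axis of that great circle. The function names and the choice of starting vector are illustrative.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def rotate(v, axis, angle):
    """Rodrigues rotation of v about the unit vector axis."""
    c, s = math.cos(angle), math.sin(angle)
    kxv, kdv = cross(axis, v), dot(axis, v)
    return tuple(v[i] * c + kxv[i] * s + axis[i] * kdv * (1 - c) for i in range(3))

def transport_along_arc(v, start, end):
    """Parallel transport v along the great-circle arc from start to end."""
    n = cross(start, end)
    norm = math.sqrt(dot(n, n))
    axis = tuple(x / norm for x in n)
    angle = math.atan2(norm, dot(start, end))
    return rotate(v, axis, angle)

def holonomy_angle_octant():
    """Angle between a tangent vector at A and its image after the circuit A-B-C-A."""
    a, b, c = (0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)
    v0 = (1.0, 0.0, 0.0)  # tangent to the sphere at A, pointing towards B
    v = v0
    for start, end in ((a, b), (b, c), (c, a)):
        v = transport_along_arc(v, start, end)
    cos_angle = max(-1.0, min(1.0, dot(v0, v)))
    return math.acos(cos_angle)

if __name__ == "__main__":
    print(holonomy_angle_octant(), math.pi / 2)
```

The rotation angle equals the solid angle enclosed by the path, here $4\pi/8 = \pi/2$ for one octant of the unit sphere.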

It seems reasonable to suppose that the information about whether the
space we are dealing with is ‘flat’ or ‘curved’ is contained in the connection
$\Gamma^\alpha_{\gamma\beta}$. In a similar way, in the gauge case the analogy we have built up so far
would lead us to expect that there are potentials $A^\mu$ which are somehow ‘flat’
(E = B = 0) and others which represent ‘curvature’ (non-zero E, B). This
is what we discuss next.




FIGURE N.5 
Closed loop ABCD in $q^1$-$q^2$ space.


N.2 Geometrical curvature and the gauge field strength tensor


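Consider a small closed loop in our (possibly curved) two-dimensional space
(see figure N.5) whose four sides are the coordinate lines $q^1 = a$, $q^1 = a + \delta a$,
$q^2 = b$, $q^2 = b + \delta b$. We want to calculate the net change (if any)
$\delta V^\alpha$ as we parallel transport $\vec{V}$ around the loop. The change along A → B is


    $$ (\delta V^\alpha)_{AB} = -\int_{q^2=b,\,q^1=a}^{q^2=b,\,q^1=a+\delta a}\Gamma^\alpha_{\gamma 1}V^\gamma dq^1
       \approx -\delta a\,\Gamma^\alpha_{\gamma 1}(a, b)V^\gamma(a, b) $$    (N.20)

to first order in $\delta a$, while that along C → D is


    $$ (\delta V^\alpha)_{CD} = -\int_{q^2=b+\delta b,\,q^1=a+\delta a}^{q^2=b+\delta b,\,q^1=a}\Gamma^\alpha_{\gamma 1}V^\gamma dq^1
       = +\int_{q^2=b+\delta b,\,q^1=a}^{q^2=b+\delta b,\,q^1=a+\delta a}\Gamma^\alpha_{\gamma 1}V^\gamma dq^1 $$
    $$ \qquad \approx \delta a\,\Gamma^\alpha_{\gamma 1}(a, b + \delta b)V^\gamma(a, b + \delta b). $$    (N.21)


Now

    $$ \Gamma^\alpha_{\gamma 1}(a, b + \delta b) \approx \Gamma^\alpha_{\gamma 1}(a, b) + \delta b\,\frac{\partial\Gamma^\alpha_{\gamma 1}}{\partial q^2} $$    (N.22)


and, remembering that we are parallel-transporting $\vec{V}$,

    $$ V^\gamma(a, b + \delta b) \approx V^\gamma(a, b) - \Gamma^\gamma_{\delta 2}V^\delta\,\delta b. $$    (N.23)

Combining (N.20) and (N.21) to lowest order, we find

    $$ (\delta V^\alpha)_{AB} + (\delta V^\alpha)_{CD} \approx \delta a\,\delta b\left[\frac{\partial\Gamma^\alpha_{\gamma 1}}{\partial q^2}V^\gamma
       - \Gamma^\alpha_{\gamma 1}\Gamma^\gamma_{\delta 2}V^\delta\right] $$    (N.24)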
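or, interchanging dummy indices $\gamma$ and $\delta$ in the last term,


    $$ (\delta V^\alpha)_{AB} + (\delta V^\alpha)_{CD} \approx \delta a\,\delta b\left[\frac{\partial\Gamma^\alpha_{\gamma 1}}{\partial q^2}
       - \Gamma^\alpha_{\delta 1}\Gamma^\delta_{\gamma 2}\right]V^\gamma. $$    (N.25)

Similarly,

    $$ (\delta V^\alpha)_{BC} + (\delta V^\alpha)_{DA} \approx \delta a\,\delta b\left[-\frac{\partial\Gamma^\alpha_{\gamma 2}}{\partial q^1}
       + \Gamma^\alpha_{\delta 2}\Gamma^\delta_{\gamma 1}\right]V^\gamma, $$    (N.26)


and so the net change around the whole small loop is


    $$ (\delta V^\alpha)_{ABCD} \approx \delta a\,\delta b\left[\frac{\partial\Gamma^\alpha_{\gamma 1}}{\partial q^2} - \frac{\partial\Gamma^\alpha_{\gamma 2}}{\partial q^1}
       + \Gamma^\alpha_{\delta 2}\Gamma^\delta_{\gamma 1} - \Gamma^\alpha_{\delta 1}\Gamma^\delta_{\gamma 2}\right]V^\gamma. $$    (N.27)


The indices ‘1’ and ‘2’ appear explicitly because the loop was chosen to go
along these directions. In general, (N.27) would take the form


    $$ (\delta V^\alpha)_{\rm loop} \approx \left[\frac{\partial\Gamma^\alpha_{\gamma\beta}}{\partial q^\sigma} - \frac{\partial\Gamma^\alpha_{\gamma\sigma}}{\partial q^\beta}
       + \Gamma^\alpha_{\delta\sigma}\Gamma^\delta_{\gamma\beta} - \Gamma^\alpha_{\delta\beta}\Gamma^\delta_{\gamma\sigma}\right]V^\gamma dA^{\beta\sigma} $$    (N.28)


where $dA^{\beta\sigma}$ is the area element. The quantity in brackets in (N.28) is the
Riemann curvature tensor $R^\alpha_{\;\gamma\beta\sigma}$ (up to a sign, depending on conventions),
which can clearly be calculated once the connection coefficients are known.
A flat space is one for which all components $R^\alpha_{\;\gamma\beta\sigma} = 0$; the reader may
verify that this is the case for our polar basis $\vec{e}_r$, $\vec{e}_\theta$ in the Euclidean plane. A
non-zero value for any component of $R^\alpha_{\;\gamma\beta\sigma}$ means the space is curved.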
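As a sketch of the verification suggested for the polar basis: from (N.5) one finds $\partial\vec{e}_r/\partial\theta = \vec{e}_\theta/r$, $\partial\vec{e}_\theta/\partial r = \vec{e}_\theta/r$ and $\partial\vec{e}_\theta/\partial\theta = -r\vec{e}_r$, so that by (N.10) the only non-zero connection coefficients are $\Gamma^\theta_{r\theta} = \Gamma^\theta_{\theta r} = 1/r$ and $\Gamma^r_{\theta\theta} = -r$. Taking $q^1 = r$, $q^2 = \theta$, and assuming the bracket of (N.27) has the form written in the first line below (index placement as in (N.10), with the last lower index labelling the direction of differentiation), the two components that are not trivially zero are:

```latex
B^\alpha_{\ \gamma} \equiv
  \frac{\partial\Gamma^\alpha_{\gamma 1}}{\partial q^2}
  - \frac{\partial\Gamma^\alpha_{\gamma 2}}{\partial q^1}
  + \Gamma^\alpha_{\delta 2}\Gamma^\delta_{\gamma 1}
  - \Gamma^\alpha_{\delta 1}\Gamma^\delta_{\gamma 2}

% alpha = r, gamma = theta:
B^r_{\ \theta}
  = 0 - \frac{\partial(-r)}{\partial r}
    + \Gamma^r_{\theta\theta}\Gamma^\theta_{\theta r} - 0
  = 1 + (-r)\left(\frac{1}{r}\right) = 0

% alpha = theta, gamma = r:
B^\theta_{\ r}
  = 0 - \frac{\partial(1/r)}{\partial r}
    + 0 - \Gamma^\theta_{\theta r}\Gamma^\theta_{r\theta}
  = \frac{1}{r^2} - \frac{1}{r^2} = 0
```

The remaining components, $B^r_{\ r}$ and $B^\theta_{\ \theta}$, vanish term by term, so the net change around the loop is zero, as expected for the Euclidean plane. The symbol $B^\alpha_{\ \gamma}$ is introduced here only as shorthand for the bracket.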

We now follow exactly similar steps to calculate the net change in $\psi$ as
given by (N.16), around the small two-dimensional rectangle defined by the
coordinate lines $x_1 = a$, $x_1 = a + \delta a$, $x_2 = b$, $x_2 = b + \delta b$, labelled as in figure
N.5 but with $q^1$ replaced by $x_1$ and $q^2$ by $x_2$. Then


    $$ (\delta\psi)_{AB} = -ieA^1(a, b)\psi(a, b)\delta a $$    (N.29)

and

    $$ (\delta\psi)_{CD} = +ieA^1(a, b + \delta b)\psi(a, b + \delta b)\delta a $$
    $$ \qquad = ie\left(A^1(a, b) + \frac{\partial A^1}{\partial x_2}\delta b\right)[\psi(a, b) - ieA^2(a, b)\psi(a, b)\delta b]\delta a $$
    $$ \qquad \approx ieA^1(a, b)\psi(a, b)\delta a
       + ie\left[\frac{\partial A^1}{\partial x_2}\psi(a, b) - ieA^1(a, b)A^2(a, b)\psi(a, b)\right]\delta a\,\delta b. $$    (N.30)


Combining (N.29) and (N.30) we find


    $$ (\delta\psi)_{AB} + (\delta\psi)_{CD} \approx \left[ie\frac{\partial A^1}{\partial x_2} + e^2A^1A^2\right]\psi\,\delta a\,\delta b. $$    (N.31)

Similarly,

    $$ (\delta\psi)_{BC} + (\delta\psi)_{DA} \approx \left[-ie\frac{\partial A^2}{\partial x_1} - e^2A^1A^2\right]\psi\,\delta a\,\delta b, $$    (N.32)


with the result that the net change around the loop is


    $$ (\delta\psi)_{ABCD} \approx ie\left(\frac{\partial A^1}{\partial x_2} - \frac{\partial A^2}{\partial x_1}\right)\psi\,\delta a\,\delta b. $$    (N.33)

For a general loop, (N.33) is replaced by

    $$ (\delta\psi)_{\rm loop} = ie\left(\frac{\partial A^\mu}{\partial x_\nu} - \frac{\partial A^\nu}{\partial x_\mu}\right)\psi\,dx_\mu dx_\nu
       = -ieF^{\mu\nu}\psi\,dx_\mu dx_\nu $$    (N.34)


where $F^{\mu\nu} = \partial^\mu A^\nu - \partial^\nu A^\mu$ is the familiar field strength tensor of QED.

The analogy we have been pursuing would therefore suggest that $F^{\mu\nu} = 0$
indicates ‘no physical effect’, while $F^{\mu\nu} \neq 0$ implies the presence of a physical
effect. Indeed, when $A^\mu$ has the ‘pure gauge’ form $A^\mu = \partial^\mu\chi$ the associated
$F^{\mu\nu}$ is zero; this is because such an $A^\mu$ can clearly be reduced to zero by
a gauge transformation (and also, consistently, because $(\partial^\mu\partial^\nu - \partial^\nu\partial^\mu)\chi = 0$).
If $A^\mu$ is not expressible as the 4-gradient of a scalar, then $F^{\mu\nu} \neq 0$
and an electromagnetic field is present, analogous to the spatial curvature
revealed by $R^\alpha_{\;\gamma\beta\sigma} \neq 0$. Once again, there is a satisfying consistency between
this ‘geometrical’ viewpoint and the discussion of the Aharonov-Bohm effect
in Section 2.6. As in our remarks at the end of the previous section, and
equations (N.17)-(N.19), equation (2.83) can be regarded as the integrated
form of (N.34), for spatial loops. Transport round such a loop results in a
non-trivial net phase change if non-zero B flux is enclosed, and this can be
observed.
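The statement that the net phase depends only on the enclosed flux can be illustrated numerically. This is a sketch under stated assumptions: a uniform magnetic field of strength `b_field` along z, described in the symmetric gauge $\vec{A} = (B/2)(-y, x, 0)$, and a rectangular loop traversed counter-clockwise; by Stokes's theorem the accumulated phase $e\oint\vec{A}\cdot d\vec{\ell}$ should equal e times the flux through the rectangle. The function names and parameters are illustrative.

```python
def vector_potential(x, y, b_field):
    """Symmetric-gauge potential (A_x, A_y) for a uniform field B along z."""
    return (-0.5 * b_field * y, 0.5 * b_field * x)

def loop_phase(charge, x0, y0, width, height, b_field, steps=200):
    """charge * (closed line integral of A . dl), counter-clockwise round a rectangle.

    Each side is integrated with the midpoint rule, which is exact here
    because A is linear in x and y.
    """
    corners = [(x0, y0), (x0 + width, y0), (x0 + width, y0 + height),
               (x0, y0 + height), (x0, y0)]
    total = 0.0
    for (xa, ya), (xb, yb) in zip(corners, corners[1:]):
        dx, dy = (xb - xa) / steps, (yb - ya) / steps
        for k in range(steps):
            xm, ym = xa + (k + 0.5) * dx, ya + (k + 0.5) * dy
            ax, ay = vector_potential(xm, ym, b_field)
            total += ax * dx + ay * dy
    return charge * total

if __name__ == "__main__":
    phase = loop_phase(1.0, 0.2, -0.3, 2.0, 1.5, 0.7)
    flux = 0.7 * 2.0 * 1.5
    print(phase, flux)  # the two numbers should agree
```

Moving the rectangle (changing `x0`, `y0`) changes the potential along each side but not the total, which depends only on the enclosed flux.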

From this point of view there is undoubtedly a strong conceptual link 
between Einstein’s theory of gravity and quantum gauge theories. In the 
former, matter (or energy) is regarded as the source of curvature of space- 
time, causing the space-time axes themselves to vary from point to point, 
and determining the trajectories of massive particles; in the latter, charge is 
the source of curvature in an ‘internal’ space (the complex $\psi$-plane, in the
U(1) case), a curvature which we call an electromagnetic field, and which has 
observable physical effects. 

The reader may consider repeating, for the local SU(2) case, the closed- 
loop transport calculation of (N.29)-(N.33). For this calculation, the place 
of the Abelian vector potential is taken by the matrix-valued non-Abelian 
potential $A^\mu = \boldsymbol{\tau}/2\cdot\boldsymbol{A}^\mu$. It will lead to the expression for the non-Abelian
field strength tensor as calculated in section 13.1.2. 


O 


Dimensional Regularization 


After combining propagator denominators of the form $(p^2 - m^2 + i\epsilon)^{-1}$ by
Feynman parameters (cf (10.40)), and shifting the origin of the loop momentum
to complete the square (cf (10.42) and (11.16)), all one-loop Feynman
integrals may be reduced to evaluating an integral of the form


    $$ I_d(\Delta, n) \equiv \int\frac{d^dk}{(2\pi)^d}\frac{1}{[k^2 - \Delta + i\epsilon]^n} $$    (O.1)

or to a similar integral with factors of k (such as $k_\mu k_\nu$) in the numerator. We
consider (O.1) first.

For our purposes, the case of physical interest is d = 4, and n is commonly
2 (e.g. in one-loop self-energies). Power-counting shows that (O.1) diverges
as $k \to \infty$ for $d \ge 2n$. The idea behind dimensional regularization (’t Hooft
and Veltman 1972) is to treat d as a variable parameter, taking values smaller
than 2n, so that (O.1) converges and can be evaluated explicitly as a function
of d (and of course the other variables, including n).$^1$ Then the nature of the
divergence as $d \to 4$ can be exposed (much as we did with the cut-off
procedure in section 10.3), and dealt with by a suitable renormalization scheme.
The crucial advantage of dimensional regularization is that it preserves gauge 
invariance, unlike the simple cut-off regularization we used in chapters 10 and 
11. 

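We write

$$ I_d = \frac{1}{(n-1)!}\left(\frac{\partial}{\partial\Delta}\right)^{n-1} \int \frac{d^dk}{(2\pi)^d}\, \frac{1}{k^2 - \Delta + i\epsilon}. \qquad \text{(O.2)} $$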


The d dimensions are understood as one time-like dimension $k^0$, and d − 1
spacelike dimensions. We begin by ‘Euclideanizing’ the integral, by setting
$k^0 = ik^e$ with $k^e$ real. Then the Minkowskian square $k^2$ becomes
$-(k^e)^2 - \boldsymbol{k}^2 \equiv -k_E^2$, and $d^dk$ becomes $i\,d^dk_E$, so that now

$$ I_d = \frac{-i}{(n-1)!}\left(\frac{\partial}{\partial\Delta}\right)^{n-1} \int \frac{d^dk_E}{(2\pi)^d}\, \frac{1}{k_E^2 + \Delta}; \qquad \text{(O.3)} $$

the ‘iε’ may be understood as included in Δ. The integral is evaluated by
introducing the following way of writing $(k_E^2 + \Delta)^{-1}$:

$$ (k_E^2 + \Delta)^{-1} = \int_0^\infty d\beta\, e^{-\beta(k_E^2 + \Delta)}, \qquad \text{(O.4)} $$

which leads to

$$ I_d = \frac{-i}{(n-1)!}\left(\frac{\partial}{\partial\Delta}\right)^{n-1} \int_0^\infty d\beta \int \frac{d^dk_E}{(2\pi)^d}\, e^{-\beta(k_E^2 + \Delta)}. \qquad \text{(O.5)} $$

¹ We concentrate here on ultraviolet divergences, but infrared ones (such as those met in
section 14.4.2) can be dealt with too, by choosing d larger than 2n.


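The interchange of the orders of the β and $k_E$ integrations is permissible since
$I_d$ is convergent. The $k_E$ integrals are, in fact, a series of Gaussians:

$$ \int \frac{d^dk_E}{(2\pi)^d}\, e^{-\beta(k_E^2 + \Delta)} = e^{-\beta\Delta} \prod_{j=1}^{d} \int \frac{dk_j}{2\pi}\, e^{-\beta k_j^2} = e^{-\beta\Delta} \left(\frac{1}{4\pi\beta}\right)^{d/2}. \qquad \text{(O.6)} $$

Hence

$$ I_d = \frac{-i}{(n-1)!}\left(\frac{\partial}{\partial\Delta}\right)^{n-1} \frac{1}{(4\pi)^{d/2}} \int_0^\infty d\beta\, e^{-\beta\Delta} \beta^{-d/2} $$
$$ \phantom{I_d} = \frac{i(-1)^n}{(4\pi)^{d/2}(n-1)!} \int_0^\infty d\beta\, e^{-\beta\Delta} \beta^{n-(d/2)-1}. \qquad \text{(O.7)} $$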

The last integral can be written in terms of Euler’s integral for the gamma
function Γ(z) defined by (see, for example, Boas 1983, chapter 11)

$$ \Gamma(z) = \int_0^\infty x^{z-1} e^{-x}\, dx. \qquad \text{(O.8)} $$

Since Γ(n) = (n − 1)!, it is convenient to write (O.7) entirely in terms of Γ
functions as

$$ I_d = \frac{i(-1)^n}{(4\pi)^{d/2}}\, \frac{\Gamma(n - d/2)}{\Gamma(n)}\, \Delta^{(d/2)-n}. \qquad \text{(O.9)} $$
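As a numerical sanity check of (O.9), the following sketch compares the Gamma-function result with a direct radial integration of the Euclidean integral for d < 2n. It assumes the relation $I_d = i(-1)^n J$, with $J = \int d^dk_E/(2\pi)^d\,(k_E^2+\Delta)^{-n}$, which follows from the Euclideanization step; the function names, the substitution k = t/(1 − t) and the midpoint rule are illustrative choices, not part of the text.

```python
import math

def euclidean_closed_form(d, n, delta):
    """Gamma-function value of J = int d^dk_E/(2 pi)^d (k_E^2 + delta)^(-n)."""
    return (math.gamma(n - d / 2) / math.gamma(n)
            * delta ** (d / 2 - n) / (4 * math.pi) ** (d / 2))

def euclidean_radial(d, n, delta, steps=20000):
    """Same integral from the d-dimensional solid angle times a radial integral.

    The radial variable k in [0, inf) is mapped to t in [0, 1) by k = t/(1-t)
    and integrated with a midpoint rule (an illustrative choice of quadrature).
    """
    solid_angle = 2 * math.pi ** (d / 2) / math.gamma(d / 2)
    h = 1.0 / steps
    total = 0.0
    for j in range(steps):
        t = (j + 0.5) * h
        k = t / (1 - t)
        jacobian = 1 / (1 - t) ** 2
        total += k ** (d - 1) / (k * k + delta) ** n * jacobian
    return solid_angle * total * h / (2 * math.pi) ** d

def minkowski_I(d, n, delta):
    """I_d(delta, n) of (O.9), returned as a complex number."""
    return 1j * (-1) ** n * euclidean_closed_form(d, n, delta)

if __name__ == "__main__":
    # d need not be an integer: the closed form and the quadrature agree for d = 2.5 too.
    for d in (3, 2.5):
        print(d, euclidean_closed_form(d, 2, 1.0), euclidean_radial(d, 2, 1.0))
```

For d = 3, n = 2 and Δ = 1 both routes give 1/(8π), which can also be obtained by elementary integration.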


Equation (O.9) gives an explicit definition of $I_d$ which can be used for any
value of d, not necessarily an integer. As a function of z, Γ(z) has isolated
poles (see appendix F of volume 1) at z = 0, −1, −2, .... The behaviour near
z = 0 is given by

$$ \Gamma(z) = \frac{1}{z} - \gamma + O(z), \qquad \text{(O.10)} $$

where γ is the Euler-Mascheroni constant having the value γ ≈ 0.5772. Using

$$ z\Gamma(z) = \Gamma(z+1), \qquad \text{(O.11)} $$

we find the behaviour near z = −1:

$$ \Gamma(-1+z) = \frac{-1}{1-z}\,\Gamma(z) = -\left[\frac{1}{z} + 1 - \gamma + O(z)\right]; \qquad \text{(O.12)} $$

similarly

$$ \Gamma(-2+z) = \frac{1}{2}\left[\frac{1}{z} + \frac{3}{2} - \gamma + O(z)\right]. \qquad \text{(O.13)} $$
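The pole expansions (O.10), (O.12) and (O.13) are easy to test numerically. The short sketch below compares them with the gamma function from the Python standard library; the helper names are illustrative, and the only claim checked is that the remainder is O(z).

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def gamma_near_0(z):
    """Leading terms of Gamma(z) near z = 0, as in (O.10)."""
    return 1 / z - EULER_GAMMA

def gamma_near_minus_1(z):
    """Leading terms of Gamma(-1 + z), as in (O.12)."""
    return -(1 / z + 1 - EULER_GAMMA)

def gamma_near_minus_2(z):
    """Leading terms of Gamma(-2 + z), as in (O.13)."""
    return 0.5 * (1 / z + 1.5 - EULER_GAMMA)

if __name__ == "__main__":
    # The three remainders should shrink roughly in proportion to z.
    for z in (1e-2, 1e-3, 1e-4):
        print(z,
              math.gamma(z) - gamma_near_0(z),
              math.gamma(-1 + z) - gamma_near_minus_1(z),
              math.gamma(-2 + z) - gamma_near_minus_2(z))
```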


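Consider now the case n = 2, for which Γ(n − d/2) in (O.9) will have a
pole at d = 4. Setting d = 4 − ε, the divergent behaviour is given by

$$ \Gamma(2 - d/2) = \frac{2}{\epsilon} - \gamma + O(\epsilon) \qquad \text{(O.14)} $$

from (O.10). $I_d(\Delta, 2)$ is then given by

$$ I_d(\Delta, 2) = \frac{i}{(4\pi)^{2-\epsilon/2}}\, \Delta^{-\epsilon/2} \left[\frac{2}{\epsilon} - \gamma + O(\epsilon)\right]. \qquad \text{(O.15)} $$

When $\Delta^{-\epsilon/2}$ and $(4\pi)^{-2+\epsilon/2}$ are expanded in powers of ε, for small ε, the
terms linear in ε will produce terms independent of ε when multiplied by the
$\epsilon^{-1}$ in the bracket of (O.15). Using $x^\epsilon \approx 1 + \epsilon \ln x + O(\epsilon^2)$ we find

$$ I_d(\Delta, 2) = \frac{i}{(4\pi)^2}\left[\frac{2}{\epsilon} - \gamma + \ln 4\pi - \ln\Delta + O(\epsilon)\right]. \qquad \text{(O.16)} $$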
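The expansion leading to (O.16) can be checked against the unexpanded expression (O.9) at n = 2. In the sketch below the function names and the sample values of Δ and ε are illustrative assumptions; the difference between the two expressions should vanish linearly with ε.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def exact_over_i(delta, eps):
    """I_d(delta, 2)/i from (O.9) with d = 4 - eps (note Gamma(2) = 1)."""
    d = 4 - eps
    return math.gamma(2 - d / 2) * delta ** (d / 2 - 2) / (4 * math.pi) ** (d / 2)

def expanded_over_i(delta, eps):
    """Right-hand side of (O.16) divided by i, dropping the O(eps) terms."""
    bracket = 2 / eps - EULER_GAMMA + math.log(4 * math.pi) - math.log(delta)
    return bracket / (4 * math.pi) ** 2

if __name__ == "__main__":
    delta = 2.0
    for eps in (1e-1, 1e-2, 1e-3):
        diff = exact_over_i(delta, eps) - expanded_over_i(delta, eps)
        print(eps, diff * (4 * math.pi) ** 2)  # remainder in units of 1/(4 pi)^2
```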


Another source of ε-dependence arises from the fact (see problem 15.7) that
a gauge coupling which is dimensionless in d = 4 dimensions will acquire mass
dimension $\mu^{\epsilon/2}$ in d = 4 − ε dimensions (check this!). A vacuum polarization
loop with two powers of the coupling will then contain a factor $\mu^{\epsilon}$. When
expanded in powers of ε, this will convert the ln Δ in (O.16) to $\ln(\Delta/\mu^2)$.

Renormalization schemes will subtract the explicit pole pieces (which diverge
as ε → 0), but may also include in the subtraction certain finite terms as
well. For example, in the ‘minimal subtraction’ (MS) scheme, one subtracts
just the pole pieces; in the ‘modified minimal subtraction’ or $\overline{\rm MS}$ (‘emm-ess-bar’)
scheme (Bardeen et al. 1978) one subtracts the pole and the ‘−γ + ln 4π’
piece.

The change from one scheme ‘A’ to another ‘B’ must involve a finite renormalization
of the form (Ellis et al. 1996, section 2.5)

$$ \alpha_s^B = \alpha_s^A (1 + c_1 \alpha_s^A + \ldots). \qquad \text{(O.17)} $$

Note that this implies that the first two coefficients of the β function are
unchanged under this transformation, so that they are scheme-independent.
Subsequent coefficients are scheme-dependent, as is the QCD parameter Λ
introduced in section 15.3. From (15.54) the two corresponding values of Λ
are related by

$$ \ln\left(\frac{\Lambda_B}{\Lambda_A}\right) = \frac{1}{2}\int_{\alpha_s^A(|q^2|)}^{\alpha_s^B(|q^2|)} \frac{dx}{\beta_0 x^2 (1 + \ldots)} \qquad \text{(O.18)} $$
$$ \phantom{\ln\left(\frac{\Lambda_B}{\Lambda_A}\right)} = \frac{c_1}{2\beta_0}, \qquad \text{(O.19)} $$

where we have taken |q²| → ∞ in (O.18) since the left-hand side is independent
of |q²|. Hence the relationship between the Λ’s in different schemes is
determined by the one-loop calculation which gives $c_1$ in (O.19). For example,
changing from MS to $\overline{\rm MS}$ gives (problem 15.8)

$$ \Lambda^2_{\overline{\rm MS}} = \Lambda^2_{\rm MS} \exp(\ln 4\pi - \gamma), \qquad \text{(O.20)} $$

as the reader may check.
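To get a feel for the size of this scheme change, the sketch below evaluates the ratio of the two Λ parameters implied by (O.20), together with the value of $c_1/\beta_0$ that (O.19) then requires. The function names are illustrative; no value of $\beta_0$ is assumed, since only the ratio $c_1/\beta_0$ enters.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def c1_over_beta0():
    """c_1/beta_0 obtained by combining (O.19) with (O.20):
    2 ln(Lambda_MSbar/Lambda_MS) = c_1/beta_0 = ln 4pi - gamma."""
    return math.log(4 * math.pi) - EULER_GAMMA

def lambda_ratio_msbar_over_ms():
    """Lambda_MSbar / Lambda_MS, the square root of exp(ln 4pi - gamma)."""
    return math.exp(0.5 * c1_over_beta0())

if __name__ == "__main__":
    print("c_1/beta_0 =", c1_over_beta0())
    print("Lambda_MSbar / Lambda_MS =", lambda_ratio_msbar_over_ms())
```

The ratio comes out at roughly 2.66, so the numerical value quoted for Λ depends appreciably on the scheme.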
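Finally, consider the integral

$$ I_d^{\mu\nu}(\Delta, n) \equiv \int \frac{d^dk}{(2\pi)^d}\, \frac{k^\mu k^\nu}{(k^2 - \Delta + i\epsilon)^n}. \qquad \text{(O.21)} $$

From Lorentz covariance this must be proportional to the only second-rank
tensor available, namely $g^{\mu\nu}$:

$$ I_d^{\mu\nu} = A g^{\mu\nu}. \qquad \text{(O.22)} $$

The constant ‘A’ can be determined by contracting both sides of (O.21) with
$g_{\mu\nu}$, using $g^{\mu\nu}g_{\mu\nu} = d$ in d dimensions. So

$$ A = \frac{1}{d}\int \frac{d^dk}{(2\pi)^d}\, \frac{k^2}{(k^2 - \Delta + i\epsilon)^n} $$
$$ \phantom{A} = \frac{1}{d}\int \frac{d^dk}{(2\pi)^d} \left[\frac{1}{(k^2 - \Delta + i\epsilon)^{n-1}} + \frac{\Delta}{(k^2 - \Delta + i\epsilon)^n}\right] $$
$$ \phantom{A} = \frac{i(-1)^n \Delta^{(d/2)-n+1}}{(4\pi)^{d/2}\, d} \left\{\frac{-\Gamma(n-1-d/2)}{\Gamma(n-1)} + \frac{\Gamma(n-d/2)}{\Gamma(n)}\right\} $$
$$ \phantom{A} = \frac{i(-1)^{n-1} \Delta^{(d/2)-n+1}}{(4\pi)^{d/2}\, d}\, \frac{d}{2}\, \frac{\Gamma(n-1-d/2)}{\Gamma(n)} $$
$$ \phantom{A} = \frac{i(-1)^{n-1} \Delta^{(d/2)-n+1}}{(4\pi)^{d/2}}\, \frac{1}{2}\, \frac{\Gamma(n-1-d/2)}{\Gamma(n)}. \qquad \text{(O.23)} $$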


Using these results, one can show straightforwardly that the gauge-non-invariant 
part of (11.18) — i.e. the piece in braces — vanishes. With the technique 
of dimensional regularization, starting from a gauge-invariant formulation of 
the theory the renormalization programme can be carried out while retaining 
manifest gauge invariance.


P


Grassmann Variables


In the path integral representation of quantum amplitudes (chapter 16) the 
fields are regarded as classical functions. Matrix elements of time-ordered 
products of bosonic operators could be satisfactorily represented (see the dis- 
cussion following (16.79)). But something new is needed to represent, for 
example, the time-ordered product of two fermionic operators: there must 
be a sign difference between the two orderings, since the fermionic operators 
anticommute. Thus it seems that to represent amplitudes involving fermionic 
operators by path integrals we must think in terms of ‘classical’ anticommut- 
ing variables. 

Fortunately, the necessary mathematics was developed by Grassmann in 
1855, and applied to quantum amplitudes by Berezin (1966). Any two Grassmann
numbers $\theta_1, \theta_2$ satisfy the fundamental relation

$$ \theta_1\theta_2 + \theta_2\theta_1 = 0, \qquad \text{(P.1)} $$

and of course

$$ \theta_1^2 = \theta_2^2 = 0. \qquad \text{(P.2)} $$

Grassmann numbers can be added and subtracted in the ordinary way, and
multiplied by ordinary numbers. For our application, the essential thing we
need to be able to do with Grassmann numbers is to integrate over them.
It is natural to think that, as with ordinary numbers and functions, integra- 
tion would be some kind of inverse of differentiation. So let us begin with 
differentiation. 


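We define

$$ \frac{\partial(a\theta)}{\partial\theta} = a, \qquad \text{(P.3)} $$

where a is any ordinary number, and

$$ \frac{\partial}{\partial\theta_1}(\theta_1\theta_2) = \theta_2; \qquad \text{(P.4)} $$

then necessarily

$$ \frac{\partial}{\partial\theta_2}(\theta_1\theta_2) = -\theta_1. \qquad \text{(P.5)} $$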

Consider now a function of one such variable, f(θ). An expansion of f in
powers of θ terminates after only two terms because of the property (P.2):

$$ f(\theta) = a + b\theta. \qquad \text{(P.6)} $$

So

$$ \frac{\partial f(\theta)}{\partial\theta} = b, \qquad \text{(P.7)} $$

but also

$$ \frac{\partial^2 f}{\partial\theta^2} = 0 \qquad \text{(P.8)} $$

for any such f. Hence the operator ∂/∂θ has no inverse (think of the matrix
analogue $A^2 = 0$: if $A^{-1}$ existed, we could deduce $0 = A^{-1}(A^2) = (A^{-1}A)A = A$
for all A). Thus we must approach Grassmann integration other than via
an inverse of differentiation.

We only need to consider integrals over the complete range of θ, of the
form

$$ \int d\theta\, f(\theta) = \int d\theta\, (a + b\theta). \qquad \text{(P.9)} $$

Such an integral should be linear in f; thus it must be a linear function of
a and b. One further property fixes its value: we require the result to be
invariant under translations of θ by θ → θ + η, where η is a Grassmann
number. This property is crucial to manipulations made in the path integral
formalism, for instance in ‘completing the square’ manipulations similar to
those in section 16.3, but with Grassmann numbers. So we require

$$ \int d\theta\, (a + b\theta) = \int d\theta\, ([a + b\eta] + b\theta). \qquad \text{(P.10)} $$

This has changed the constant (independent of θ) term, but left the linear
term unchanged. The only linear function of a and b which behaves like this
is a multiple of b, which is conventionally taken to be simply b. Thus we define

$$ \int d\theta\, (a + b\theta) = b, \qquad \text{(P.11)} $$


which means that integration is in some sense the same as differentiation! 

When we integrate over products of different θ’s, we need to specify a
convention about the order in which the integrals are to be performed. We
adopt the convention

$$ \int d\theta_1 \int d\theta_2\; \theta_2\theta_1 = +1, \qquad \text{(P.12)} $$

that is, the innermost integral is done first, then the next, and so on.

Since our application will be to Dirac fields, which are complex-valued, 
we need to introduce complex Grassmann numbers, which are built out of 
real and imaginary parts in the usual way (this would not be necessary for 
Majorana fermions). Thus we may define

$$ \psi = \frac{1}{\sqrt{2}}(\theta_1 + i\theta_2), \qquad \psi^* = \frac{1}{\sqrt{2}}(\theta_1 - i\theta_2), \qquad \text{(P.13)} $$

and then

$$ -i\, d\psi\, d\psi^* = d\theta_1\, d\theta_2. \qquad \text{(P.14)} $$

It is convenient to define complex conjugation to include reversing the order
of quantities:

$$ (\psi\chi)^* = \chi^*\psi^*. \qquad \text{(P.15)} $$


Then (P.14) is consistent under complex conjugation. 
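The relation (P.14) and its behaviour under conjugation can be verified in a few lines. The following worked check assumes the ordering convention $\int d\theta_1 \int d\theta_2\, \theta_2\theta_1 = +1$ (and the same convention for $\psi, \psi^*$), and, for the last line, that the differentials anticommute and are conjugated with the same order reversal as in (P.15); these assumptions are stated here for the purpose of the check.

```latex
\begin{align*}
\psi\psi^* &= \tfrac12(\theta_1 + i\theta_2)(\theta_1 - i\theta_2)
            = \tfrac12\left(-i\theta_1\theta_2 + i\theta_2\theta_1\right)
            = -i\,\theta_1\theta_2 ,\\
\theta_2\theta_1 &= -\theta_1\theta_2 = -i\,\psi\psi^* = i\,\psi^*\psi ,\\
1 = \int d\theta_1 \int d\theta_2\; \theta_2\theta_1
  &= i \int d\theta_1 \int d\theta_2\; \psi^*\psi
  \;\Longrightarrow\;
  \int d\theta_1 \int d\theta_2\; \psi^*\psi = -i
  = -i \int d\psi \int d\psi^*\; \psi^*\psi ,\\
(-i\, d\psi\, d\psi^*)^* &= i\, d\psi\, d\psi^* , \qquad
(d\theta_1\, d\theta_2)^* = d\theta_2\, d\theta_1 = -\,d\theta_1\, d\theta_2 .
\end{align*}
```

The third line reproduces $d\theta_1 d\theta_2 = -i\, d\psi\, d\psi^*$, and the last line shows that conjugating both sides simply multiplies the relation by −1.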

We are now ready to evaluate some Gaussian integrals over Grassmann 
variables, which is essentially all we need in the path integral formalism. We 
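begin with

$$ \int\!\!\int d\psi^* d\psi\; e^{-b\psi^*\psi} = \int\!\!\int d\psi^* d\psi\, (1 - b\psi^*\psi) = \int\!\!\int d\psi^* d\psi\, (1 + b\psi\psi^*) = b. \qquad \text{(P.16)} $$

Note that the analogous integral with ordinary variables is

$$ \int\!\!\int dx\, dy\; e^{-b(x^2 + y^2)/2} = 2\pi/b. \qquad \text{(P.17)} $$

The important point here is that, in the Grassmann case, b appears with a
positive, rather than a negative, power. On the other hand, if we insert a
factor $\psi\psi^*$ into the integrand in (P.16), we find that it becomes

$$ \int\!\!\int d\psi^* d\psi\; \psi\psi^*\, e^{-b\psi^*\psi} = \int\!\!\int d\psi^* d\psi\; \psi\psi^* = 1, \qquad \text{(P.18)} $$

and the insertion has effectively produced a factor $b^{-1}$. This effect of an
insertion is the same in the ‘ordinary variables’ case:

$$ \int\!\!\int dx\, dy\; \frac{(x^2 + y^2)}{2}\, e^{-b(x^2 + y^2)/2} = 2\pi/b^2. \qquad \text{(P.19)} $$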
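The two ‘ordinary variables’ results (P.17) and (P.19) can be confirmed numerically. The sketch below factorizes each double integral into one-dimensional Gaussian integrals and evaluates those with a midpoint rule; the function names, the integration range and the step count are illustrative choices.

```python
import math

def gauss_1d(b, power=0, half_width=12.0, steps=20000):
    """Midpoint estimate of int dx x**power exp(-b x^2 / 2) over a wide finite range."""
    scale = half_width / math.sqrt(b)  # +-12 widths of the Gaussian; the tails are negligible
    h = 2 * scale / steps
    total = 0.0
    for j in range(steps):
        x = -scale + (j + 0.5) * h
        total += x ** power * math.exp(-b * x * x / 2)
    return total * h

def p17(b):
    """int int dx dy exp(-b (x^2 + y^2)/2), as a product of two 1-D integrals."""
    return gauss_1d(b) ** 2

def p19(b):
    """int int dx dy (x^2 + y^2)/2 exp(-b (x^2 + y^2)/2).

    The x^2 and y^2 pieces contribute equally, so the factor 1/2 cancels.
    """
    return gauss_1d(b, power=2) * gauss_1d(b)

if __name__ == "__main__":
    b = 3.0
    print(p17(b), 2 * math.pi / b)
    print(p19(b), 2 * math.pi / b ** 2)
    print(p19(b) / p17(b), 1 / b)  # the insertion produces a factor 1/b
```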


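Now consider a Gaussian integral involving two different Grassmann variables:

$$ \int\!\!\int\!\!\int\!\!\int d\psi_1^* d\psi_1\, d\psi_2^* d\psi_2\; e^{-\psi^{*{\rm T}} M \psi}, \qquad \text{(P.20)} $$

where

$$ \psi = \begin{pmatrix} \psi_1 \\ \psi_2 \end{pmatrix} \qquad \text{(P.21)} $$

and M is a 2 × 2 matrix, whose entries are ordinary numbers. The only
terms which survive the integration are those which, in the expansion of the
exponential, contain each of $\psi_1^*$, $\psi_1$, $\psi_2^*$ and $\psi_2$ exactly once. These are the
terms

$$ \tfrac{1}{2}\left[M_{11}M_{22}(\psi_1^*\psi_1\psi_2^*\psi_2 + \psi_2^*\psi_2\psi_1^*\psi_1) + M_{12}M_{21}(\psi_1^*\psi_2\psi_2^*\psi_1 + \psi_2^*\psi_1\psi_1^*\psi_2)\right]. \qquad \text{(P.22)} $$

To integrate (P.22) conveniently, according to the convention (P.12), we need
to re-order the terms into the form $\psi_2\psi_2^*\psi_1\psi_1^*$; this produces

$$ (M_{11}M_{22} - M_{12}M_{21})(\psi_2\psi_2^*\psi_1\psi_1^*), \qquad \text{(P.23)} $$

and the integral (P.20) is therefore just

$$ \int\!\!\int\!\!\int\!\!\int d\psi_1^* d\psi_1\, d\psi_2^* d\psi_2\; e^{-\psi^{*{\rm T}} M \psi} = \det M. \qquad \text{(P.24)} $$

The reader may show, or take on trust, the obvious generalization to N independent
complex Grassmann variables $\psi_1, \psi_2, \psi_3, \ldots, \psi_N$. This result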
is sufficient to establish the assertion made in section 16.4 concerning the 
integral (16.90), when written in ‘discretized’ form. 
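The determinant result (P.24) and its N-variable generalization can be checked by brute force with a small Grassmann algebra. The sketch below is one possible implementation, not the book's: an element is stored as a dictionary from sorted tuples of generator labels to coefficients, $\psi_i^*$ and $\psi_i$ are given the hypothetical labels 2i and 2i+1, and the Berezin integral over a generator is taken to be the left derivative, which reproduces (P.11) and the innermost-first ordering convention.

```python
from math import factorial

def mono_product(a, b):
    """Multiply two monomials given as sorted tuples of generator labels.

    Returns (sign, sorted_tuple); sign is 0 when a generator repeats (theta^2 = 0).
    """
    if set(a) & set(b):
        return 0, ()
    seq, sign = list(a + b), 1
    for i in range(len(seq)):                 # bubble sort; each swap of two
        for j in range(len(seq) - 1 - i):     # generators costs a factor -1
            if seq[j] > seq[j + 1]:
                seq[j], seq[j + 1] = seq[j + 1], seq[j]
                sign = -sign
    return sign, tuple(seq)

def mul(x, y):
    """Product of two algebra elements stored as {monomial: coefficient}."""
    out = {}
    for ma, ca in x.items():
        for mb, cb in y.items():
            sign, m = mono_product(ma, mb)
            if sign:
                out[m] = out.get(m, 0) + sign * ca * cb
    return out

def add(x, y, scale=1):
    """Return x + scale * y."""
    out = dict(x)
    for m, c in y.items():
        out[m] = out.get(m, 0) + scale * c
    return out

def grassmann_exp(x, max_order):
    """Terminating series 1 + x + x^2/2! + ... for an even element x."""
    result, power = {(): 1}, {(): 1}
    for k in range(1, max_order + 1):
        power = mul(power, x)
        result = add(result, power, 1 / factorial(k))
    return result

def integrate(x, g):
    """Berezin integral over generator g: move theta_g to the left, then strip it."""
    out = {}
    for m, c in x.items():
        if g in m:
            p = m.index(g)
            rest = m[:p] + m[p + 1:]
            out[rest] = out.get(rest, 0) + (-1) ** p * c
    return out

def gaussian_integral(M):
    """int prod_i dpsi_i^* dpsi_i exp(-psi^{*T} M psi) for an n x n matrix M."""
    n = len(M)
    exponent = {}
    for i in range(n):
        for j in range(n):
            term = mul({(2 * i,): 1}, {(2 * j + 1,): 1})   # psi_i^* psi_j
            exponent = add(exponent, term, -M[i][j])
    value = grassmann_exp(exponent, n)
    for g in reversed(range(2 * n)):   # innermost (rightmost) integral first
        value = integrate(value, g)
    return value.get((), 0)

if __name__ == "__main__":
    print(gaussian_integral([[1, 2], [3, 4]]))                   # det = -2
    print(gaussian_integral([[2, 0, 1], [1, 3, 0], [0, 1, 4]]))  # det = 25
```

Storing monomials in sorted order keeps the sign bookkeeping in one place (`mono_product`), which is the step most easily mishandled when re-ordering terms by hand.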

We may contrast (P.24) with an analogous result for two ordinary complex
numbers $z_1$, $z_2$. In this case we consider the integral

$$ \int\!\!\int\!\!\int\!\!\int dz_1\, dz_1^*\, dz_2\, dz_2^*\; e^{-z^\dagger H z}, \qquad \text{(P.25)} $$

where z is a two-component column matrix with elements $z_1$ and $z_2$. We take
the matrix H to be Hermitian, with positive eigenvalues $b_1$ and $b_2$. Let H be
diagonalized by the unitary transformation

$$ z' = U z \qquad \text{(P.26)} $$

with $UU^\dagger = I$. Then

$$ dz_1'\, dz_2' = \det U\; dz_1\, dz_2, \qquad \text{(P.27)} $$

and so

$$ dz_1'\, dz_1'^*\, dz_2'\, dz_2'^* = dz_1\, dz_1^*\, dz_2\, dz_2^*, \qquad \text{(P.28)} $$

since $|\det U|^2 = 1$. The integral (P.25) then becomes

$$ \int\!\!\int\!\!\int\!\!\int dz_1'\, dz_1'^*\, dz_2'\, dz_2'^*\; e^{-b_1 z_1'^* z_1' - b_2 z_2'^* z_2'}, \qquad \text{(P.29)} $$

the integrals converging provided $b_1, b_2 > 0$. Next, setting $z_1' = (x_1 + iy_1)/\sqrt{2}$,
$z_2' = (x_2 + iy_2)/\sqrt{2}$, (P.29) can be evaluated using (P.17), and the
result is proportional to $(b_1 b_2)^{-1}$, which is the inverse of the determinant
of the matrix H, when diagonalized. Thus (compare (P.16) and (P.17))
Gaussian integrals over complex Grassmann variables are proportional to the
determinant of the matrix in the exponent, while those over ordinary complex
variables are proportional to the inverse of the determinant.

Returning to integrals of the form (P.20), consider now a two-variable
(both complex) analogue of (P.18):

$$ \int\!\!\int\!\!\int\!\!\int d\psi_1^* d\psi_1\, d\psi_2^* d\psi_2\; \psi_1\psi_2^*\, e^{-\psi^{*{\rm T}} M \psi}. \qquad \text{(P.30)} $$

This time, only the term $\psi_1^*\psi_2$ in the expansion of the exponential will survive
the integration, and the result is just $-M_{12}$. By exploring a similar integral
(still with the term $\psi_1\psi_2^*$) in the case of three complex Grassmann variables,
the reader should be convinced that the general result is

$$ \left(\prod_i \int\!\!\int d\psi_i^* d\psi_i\right) \psi_k\psi_l^*\, e^{-\psi^{*{\rm T}} M \psi} = (M^{-1})_{kl}\, \det M. \qquad \text{(P.31)} $$
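For the two-variable case the sign bookkeeping can be written out explicitly. The following worked step is a sketch that assumes the measure ordering $d\psi_1^* d\psi_1\, d\psi_2^* d\psi_2$ with the innermost integral done first, so that the monomial $\psi_2\psi_2^*\psi_1\psi_1^*$ integrates to +1.

```latex
\begin{align*}
\psi_1\psi_2^*\,\bigl(-M_{12}\,\psi_1^*\psi_2\bigr)
  &= -M_{12}\,\psi_1\psi_2^*\psi_1^*\psi_2
   = +M_{12}\,\psi_2\,\psi_1\psi_2^*\psi_1^*
   = -M_{12}\,\psi_2\psi_2^*\,\psi_1\psi_1^* ,\\
M^{-1} &= \frac{1}{\det M}
  \begin{pmatrix} M_{22} & -M_{12} \\ -M_{21} & M_{11} \end{pmatrix}
  \quad\Longrightarrow\quad
  (M^{-1})_{12}\,\det M = -M_{12} .
\end{align*}
```

In the first line $\psi_2$ is moved to the front past three Grassmann variables (a factor −1), and then $\psi_1$ and $\psi_2^*$ are interchanged (another factor −1), so the integral equals $-M_{12}$, in agreement with the general formula for k = 1, l = 2.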


With this result we can make plausible the fermionic analogue of (16.87),
namely

$$ \langle\Omega|T\{\psi(x_1)\bar\psi(x_2)\}|\Omega\rangle = \frac{\int \mathcal{D}\bar\psi\, \mathcal{D}\psi\; \psi(x_1)\bar\psi(x_2) \exp\left[-\int d^4x_E\; \bar\psi(i\,\partial\!\!\!/ - m)\psi\right]}{\int \mathcal{D}\bar\psi\, \mathcal{D}\psi\; \exp\left[-\int d^4x_E\; \bar\psi(i\,\partial\!\!\!/ - m)\psi\right]}; \qquad \text{(P.32)} $$

note that $\bar\psi$ and $\psi^*$ are unitarily equivalent. The denominator of this expression
is¹ $\det(i\,\partial\!\!\!/ - m)$, while the numerator is this same determinant multiplied
by the inverse of the operator $(i\,\partial\!\!\!/ - m)$; but this is just $(p\!\!\!/ - m)^{-1}$ in momentum
space, the familiar Dirac propagator.

¹ The reader may interpret this as a finite-dimensional determinant, after discretization.


Q


Feynman Rules for Tree Graphs in QCD and 
the Electroweak Theory 


Q.1 QCD 
Q.1.1 External particles 
Quarks 


The SU(3) colour degree of freedom is not written explicitly; the spinors have
3 (colour) × 4 (Dirac) components. For each fermion or antifermion line
entering the graph include the spinor

$$ u(p, s) \quad \text{or} \quad v(p, s) \qquad \text{(Q.1)} $$

and for spin-½ particles leaving the graph the spinor

$$ \bar u(p', s') \quad \text{or} \quad \bar v(p', s'), \qquad \text{(Q.2)} $$

as for QED.

Gluons

Besides the spin-1 polarization vector, external gluons also have a ‘colour
polarization’ vector $a^c$ (c = 1, 2, ..., 8) specifying the particular colour state
involved. For each gluon line entering the graph include the factor

$$ \epsilon_\mu(k, \lambda)\, a^c \qquad \text{(Q.3)} $$

and for gluons leaving the graph the factor

$$ \epsilon_\mu^*(k', \lambda')\, a^{c*}. \qquad \text{(Q.4)} $$


Q.1.2 Propagators

Quark

$$ \frac{i}{p\!\!\!/ - m} \qquad \text{(Q.5)} $$

Gluon

$$ \frac{i}{k^2}\left[-g^{\mu\nu} + (1 - \xi)\frac{k^\mu k^\nu}{k^2}\right]\delta^{ab} \qquad \text{(Q.6)} $$

for a general ξ gauge. Calculations are usually performed in Lorentz or Feynman
gauge with ξ = 1 and gluon propagator equal to

$$ -i\, \frac{g^{\mu\nu}\delta^{ab}}{k^2}. \qquad \text{(Q.7)} $$

Here a and b run over the 8 colour indices 1, 2, ..., 8.
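Q.1.3 Vertices

Quark-gluon vertex (gluon with Lorentz index μ and colour index a)

$$ -i g_s \frac{\lambda_a}{2} \gamma_\mu $$

Three-gluon vertex (gluons labelled $(k_1, \mu, a)$, $(k_2, \nu, b)$, $(k_3, \lambda, c)$)

$$ -g_s f_{abc}\left[g_{\mu\nu}(k_1 - k_2)_\lambda + g_{\nu\lambda}(k_2 - k_3)_\mu + g_{\lambda\mu}(k_3 - k_1)_\nu\right] $$

Four-gluon vertex (gluons labelled $(\mu, a)$, $(\nu, b)$, $(\lambda, c)$, $(\rho, d)$)

$$ -i g_s^2 \left[f_{abe}f_{cde}(g_{\mu\lambda}g_{\nu\rho} - g_{\mu\rho}g_{\nu\lambda}) + f_{ade}f_{bce}(g_{\mu\nu}g_{\lambda\rho} - g_{\mu\lambda}g_{\nu\rho}) + f_{ace}f_{dbe}(g_{\mu\rho}g_{\nu\lambda} - g_{\mu\nu}g_{\lambda\rho})\right] $$

It is important to remember that the rules given above are only adequate
for tree diagram calculations in QCD (see section 13.3.3).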


Q.2 The electroweak theory 


For tree graph calculations, it is convenient to use the U gauge Feynman rules 
(sections 19.5 and 19.6) in which no unphysical particles appear. These U 
gauge rules are given below for the leptons $l = (e, \mu, \tau)$, $\nu_l = (\nu_e, \nu_\mu, \nu_\tau)$; for
the $t_3 = +1/2$ quarks denoted by f, where f = u, c, t; and for the $t_3 = -1/2$
CKM-mixed quarks denoted by f′ where f′ = d′, s′, b′.
Note that for simplicity we do not include neutrino flavour mixing.


Q.2.1 External particles 
Leptons and quarks 


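For each fermion or antifermion line entering the graph include the spinor

$$ u(p, s) \quad \text{or} \quad v(p, s) \qquad \text{(Q.8)} $$

and for spin-½ particles leaving the graph the spinor

$$ \bar u(p', s') \quad \text{or} \quad \bar v(p', s'). \qquad \text{(Q.9)} $$

Vector bosons

For each vector boson line entering the graph include the factor

$$ \epsilon_\mu(k, \lambda) \qquad \text{(Q.10)} $$

and for vector bosons leaving the graph the factor

$$ \epsilon_\mu^*(k', \lambda'). \qquad \text{(Q.11)} $$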
Q.2.2 Propagators

Leptons and quarks

$$ \frac{i}{p\!\!\!/ - m} \qquad \text{(Q.12)} $$

Vector bosons (U gauge)

$$ \frac{i}{k^2 - M_V^2}\left(-g_{\mu\nu} + k_\mu k_\nu / M_V^2\right) \qquad \text{(Q.13)} $$

where ‘V’ stands for either ‘W’ (the W-boson) or ‘Z’ (the Z⁰).

Higgs particle

$$ \frac{i}{p^2 - m_H^2} \qquad \text{(Q.14)} $$

Q.2.3 Vertices
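Charged current weak interactions

Leptons

$$ -i\frac{g}{\sqrt{2}}\, \gamma_\mu \frac{1 - \gamma_5}{2} $$

Quarks

$$ -i\frac{g}{\sqrt{2}}\, \gamma_\mu \frac{1 - \gamma_5}{2}\, V_{ff'} $$

Neutral current weak interactions (no neutrino mixing)

Fermions

$$ \frac{-ig}{\cos\theta_W}\, \gamma_\mu \left[c_L^f \frac{1 - \gamma_5}{2} + c_R^f \frac{1 + \gamma_5}{2}\right], $$

where

$$ c_L^f = t_3^f - \sin^2\theta_W\, Q_f \qquad \text{(Q.15)} $$
$$ c_R^f = -\sin^2\theta_W\, Q_f, \qquad \text{(Q.16)} $$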


and f stands for any fermion. 
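As an illustration of (Q.15) and (Q.16), the sketch below tabulates the neutral-current couplings for one generation of fermions. The weak isospin and charge assignments are the standard ones; the value sin²θ_W = 0.23 is only an illustrative input, and the dictionary and function names are hypothetical.

```python
from fractions import Fraction

# (t_3, Q) for one generation: standard electroweak quantum numbers.
FERMIONS = {
    "nu_e": (Fraction(1, 2), Fraction(0)),
    "e":    (Fraction(-1, 2), Fraction(-1)),
    "u":    (Fraction(1, 2), Fraction(2, 3)),
    "d":    (Fraction(-1, 2), Fraction(-1, 3)),
}

def neutral_current_couplings(name, sin2_theta_w):
    """Return (c_L, c_R) for the named fermion from (Q.15) and (Q.16)."""
    t3, charge = FERMIONS[name]
    c_left = float(t3) - sin2_theta_w * float(charge)
    c_right = -sin2_theta_w * float(charge)
    return c_left, c_right

if __name__ == "__main__":
    sin2_theta_w = 0.23  # illustrative value, not taken from the text
    for name in FERMIONS:
        c_left, c_right = neutral_current_couplings(name, sin2_theta_w)
        print(f"{name:5s} c_L = {c_left:+.4f}  c_R = {c_right:+.4f}")
```

The neutrino has no right-handed coupling because its charge vanishes, while charged fermions couple to the Z⁰ with both chiralities.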


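Vector boson couplings

(i) Trilinear couplings:

γW⁺W⁻ vertex (photon labelled $(k_\gamma, \mu)$; W bosons labelled $(k_1, \nu)$ and $(k_2, \lambda)$)

$$ ie\left[g_{\nu\lambda}(k_1 - k_2)_\mu + g_{\lambda\mu}(k_2 - k_\gamma)_\nu + g_{\mu\nu}(k_\gamma - k_1)_\lambda\right] $$

Z⁰W⁺W⁻ vertex

$$ ig\cos\theta_W\left[g_{\nu\lambda}(k_1 - k_2)_\mu + g_{\lambda\mu}(k_2 - k_3)_\nu + g_{\mu\nu}(k_3 - k_1)_\lambda\right] $$

(ii) Quadrilinear couplings:

$$ -ig^2\cos^2\theta_W\left(2g_{\alpha\beta}g_{\mu\nu} - g_{\alpha\mu}g_{\beta\nu} - g_{\alpha\nu}g_{\beta\mu}\right) $$

Four-W vertex (legs $W_\mu$, $W_\nu$, $W_\alpha$, $W_\beta$)

$$ ig^2\left(2g_{\mu\alpha}g_{\nu\beta} - g_{\mu\beta}g_{\alpha\nu} - g_{\mu\nu}g_{\alpha\beta}\right) $$

Higgs couplings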


(i) Trilinear couplings

HW⁺W⁻ vertex

$$ i g M_W\, g_{\nu\lambda} $$

HZ⁰Z⁰ vertex

$$ \frac{ig}{\cos\theta_W}\, M_Z\, g_{\nu\lambda} $$

Fermion Yukawa couplings (fermion mass $m_f$)

$$ -i\,\frac{g}{2}\,\frac{m_f}{M_W} $$

Trilinear self-coupling

$$ -i\,\frac{3 m_H^2\, g}{2 M_W} $$

(ii) Quadrilinear couplings:

HHW⁺W⁻ vertex

$$ \frac{ig^2}{2}\, g_{\mu\nu} $$

HHZZ vertex

$$ \frac{ig^2}{2\cos^2\theta_W}\, g_{\mu\nu} $$

Quadrilinear self-coupling

$$ -i\,\frac{3 m_H^2\, g^2}{4 M_W^2} $$




References 


Aad G et al. 2012a (ATLAS Collaboration) Phys. Lett. B 710 49 

——2012b Phys. Lett. B 716 1 

Aaij R et al. 2012 (LHCb Collaboration) Phys. Rev. Lett. 108 111602 

Aaltonen T et al. 2008 (CDF Collaboration) Phys. Rev. Lett. 100 121802 

——2012 (CDF and DO Collaborations) Phys. Rev. Lett. 109 071804 

Abachi S et al. 1995a (DO Collaboration) Phys. Rev. Lett. 74 2422 

——1995b Phys. Rev. Lett. 74 2632 

Abashian A et al. 2001 (Belle Collaboration) Phys. Rev. Lett. 86 2509 

Abbiendi G et al. 2001 Eur. Phys. J. C 20 601 

Abdurashitov J N et al. 2009 Phys. Rev. C 80 015807 

Abe F et al. 1994a (CDF Collaboration) Phys. Rev. D 50 2966 

——1994b Phys. Rev. Lett. 73 225 

——1995a Phys. Rev. D 52 4784 

——1995b Phys. Rev. Lett. 74 2626 

Abe K 1991 Proc. 25th Int. Conf. on High Energy Physics eds K K Phua and Y 
Yamaguchi (Singapore: World Scientific) p 33 

Abe K et al. 2000 Phys. Rev. Lett. 84 5945 

——2011 (T2K Collaboration) Phys. Rev. Lett. 107 041801 

Abe S et al. 2008 (KamLAND Collaboration) Phys. Rev. Lett. 100 221803 

Abramovicz H et al. 1982a Z. Phys. C 12 289 

——1982b Z. Phys. C 13 199 

——1983 Z. Phys. C 17 283 

Abreu P et al. 1990 Phys. Lett. B 242 536 

Adamson P et al 2011 (MINOS Collaboration) Phys. Rev. Lett. 106 181801 

Adler S L 1963 Phys. Rev. 143 1144 

——1965 Phys. Rev. Lett. 14 1051 

——1969 Phys. Rev. 177 2426 

——1970 Lectures on Elementary Particles and Quantum Field Theory (Proceedings
of the Brandeis Summer Institute) vol 1, ed S Deser et al. (Boston, MA: MIT) 

Adler S L and Bardeen W A 1969 Phys. Rev. 182 1517 

Aharmim B et al. 2010 (SNO Collaboration) Phys. Rev. C 81 055504 

Ahmad Q R et al. 2001 (SNO Collaboration) Phys. Rev. Lett. 87 071301 

——2002 Phys. Rev. Lett. 89 011301 

Ahn J K et al. 2012 (RENO Collaboration) Phys. Rev. Lett. 108 191802 

Ahn M H et al. 2006 (K2K Collaboration) Phys. Rev. D 74 072003 

Aitchison I J R 2007 Supersymmetry in Particle Physics An Elementary Introduction 
(Cambridge: Cambridge University Press) 

Aitchison I J R et al. 1995 Phys. Rev. B 51 6531 

Akhundov A A et al. 1986 Nucl. Phys. B 276 1 

Akrawy M Z et al. 1990 (OPAL Collaboration) Phys. Lett. B 235 389 




Alavi-Harati A et al. 2003 (KTeV Collaboration) Phys. Rev. D 67 012005; ibid. D 
70 079904 (erratum) 

Ali A and Kramer G 2011 Eur. Phys. J. H 36 245 

Allaby J et al. 1988 J. Phys. C: Solid State Phys. 38 403 

Allasia D et al. 1984 Phys. Lett. B 135 231 

——1985 Z. Phys. C 28 321 

Allton C R et al. 1995 Nucl. Phys. B 437 641 

——2002 (UKQCD Collaboration) Phys. Rev. D 65 054502

Alper B et al. 1973 Phys. Lett. B 44 521 

Altarelli G 1982 Phys. Rep. 81 1 

Altarelli G and Parisi G 1977 Nucl. Phys. B 126 298 

Altarelli G et al. 1978a Nucl. Phys. B 143 521 

——1978b Nucl. Phys. B 146 544(E) 

——1979 Nucl. Phys. B 157 461 

——1989 Z. Phys. at LEP-1 CERN 89-08 (Geneva) 

Altman M et al. 2005 Phys. Lett. B 616 174 

Amaudruz P et al. 1992 (NMC Collaboration) Phys. Lett. B 295 159 

An F P et al. 2012 (Daya Bay Collaboration) Phys. Rev. Lett. 108 171803 

Anderson P W 1963 Phys. Rev. 130 439 

Andreotti E et al. (CUORCINO Collaboration) 2011 Astropart. Phys. 34 822 

Antoniadis I 2002 2001 European School of High Energy Physics ed N Ellis and J 
March-Russell CERN 2002-002 (Geneva) pp 301ff 

Apollonio M et al. 2003 (CHOOZ Collaboration) Eur. Phys. J. C 27 331 

Appel J A et al. 1986 Z. Phys. C 30 341 

Appelquist T and Chanowitz M S 1987 Phys. Rev. Lett. 59 2405 

Arnison G et al. 1983a Phys. Lett. B 122 103 

——1983b Phys. Lett. B 126 398 

——1984 Phys. Lett. B 136 294

——1985 Phys. Lett. B 158 494

——1986 Phys. Lett. B 166 484

Arnold R et al. (NEMO Collaboration) 2006 Nucl. Phys. A 765 483 

Attwood D et al. 2001 Phys. Rev. D 63 036005 

Aubert B et al. 2001 (BaBar Collaboration) Phys. Rev. Lett. 86 2515 

——2004 Phys. Rev. Lett. 93 131801 

——2007a Phys. Rev. Lett. 99 171803 

——2007b Phys. Rev. D 76 102004 

——2007c Phys. Rev. Lett. 98 211802 

——2008 Phys. Rev. D 78 034023

——2010 Phys. Rev. Lett. 105 121801 

Aubin C et al. 2004 Phys. Rev. D 70 09505 

Ayres D S (NOvA Collaboration) 1995 arXiv:hep-ex/0503053 

Bagnaia P et al. 1983 Phys. Lett. B 129 130 

——1984 Phys. Lett. B 144 283 

Bahcall J N et al. 1968 Phys. Rev. Lett. 20 1209 

——2001 Astrophys. J. 555 990

——2005 ibid. 621 L85 

Baikov P A et al. 2008 Phys. Rev. Lett. 101 012002 

——2009 Nucl. Phys. Proc. Suppl. 189 49

Bailin D 1982 Weak Interactions (Bristol: Adam Hilger) 




Banks T et al. 1976 Phys. Rev. D 13 1043 

Banner M et al. 1982 Phys. Lett. B 118 203 

——1983 Phys. Lett. B 122 476 

Barber D P et al. 1979 Phys. Rev. Lett. 43 830 

Bardeen J, Cooper L N and Schrieffer J R 1957 Phys. Rev. 108 1175 

Bardeen W A, Fritzsch H and Gell-Mann M 1973 in Scale and Conformal Symmetry 
in Hadron Physics ed R Gatto (New York: Wiley) pp 139-151 

Bardeen W A et al. 1978 Phys. Rev. D 18 3998 

Bardin D Yu et al. 1989 Z. Phys. C 44 493 

Barczyk A et al. 2008 (FAST Collaboration) Phys. Lett. B 663 172

Barger V et al. 1983 Z. Phys. C 21 99 

Barr G D et al. 1993 (NA31 Collaboration) Phys. Lett. B 317 233 

Bartel et al. 1986 (JADE Collaboration) Z. Phys. C 33 23 

Batley J R et al. 2002 (NA48 Collaboration) Phys. Lett. B 544 97 

Beenakker W and Hollik W 1988 Z. Phys. C 40 569 

Belavin A A et al. 1975 Phys. Lett. B 59 85 

Bell J S and Jackiw R 1969 Nuovo Cimento A 60 47 

Benvenuti A C et al. 1989 (BCDMS Collaboration) Phys. Lett. B 223 485 

Berends F A et al. 1981 Phys. Lett. B 103 124 

Berezin F A 1966 The Method of Second Quantisation (New York: Academic) 

Bergsma F et al. 1983 Phys. Lett. B 122 465 

Beringer J et al. 2012 (Particle Data Group) Phys. Rev. D 86 010001

Berman S M, Bjorken J D and Kogut J B 1971 Phys. Rev. D 4 3388 

Bernard C W, Golterman M F and Shamir Y 2006 Phys. Rev. D 73 114511 

Bernreuther W and Wetzel W 1982 Nucl. Phys. B 197 128 

Bernstein J 1974 Rev. Mod. Phys. 46 7 

Bethke S 2009 Eur. Phys. J. C 64 689 

Bethke S et al. 1988 (JADE Collaboration) Phys. Lett. B 213 235 

Bettini A 2008 Introduction to Elementary Particle Physics (Cambridge: Cambridge 
University Press) 

Bigi I I and Sanda A I 2000 CP Violation (Cambridge: Cambridge University Press) 

Bijnens J et al. 1996 Phys. Lett. B 374 2010 

Binney J J et al. 1992 The Modern Theory of Critical Phenomena (Oxford: Claren- 
don) 

Bjorken J D 1973 Phys. Rev. D8 4098 

Bjorken J D and Glashow S L 1964 Phys. Lett. 11 255 

Blatt J M 1964 Theory of Superconductivity (New York: Academic) 

Bloch F and Nordsieck A 1937 Phys. Rev. 52 54 

Bludman S A 1958 Nuovo Cimento 9 443 

Boas M L 1983 Mathematical Methods in the Physical Sciences (New York: Wiley) 

Bogoliubov N N 1947 J. Phys. USSR 11 23 

——1958 Nuovo Cimento 7 794 

Bogoliubov N N et al. 1959 A New Method in the Theory of Superconductivity (New 
York: Consultants Bureau, Inc.) 

Bouchiat C C et al. 1972 Phys. Lett. B 38 519 

Bouchiendra R et al. 2011 Phys. Rev. Lett. 106 080801 

Branco G C et al. 1999 CP Violation (Oxford: Oxford University Press) 

Brandelik R et al. 1979 Phys. Lett. B 86 243 

Brodsky S J, Lepage G P and Mackenzie P B 1983 Phys. Rev. D 28 228 




Budny R 1975 Phys. Lett. B 55 227 

Buras A J et al. 1994 Phys. Rev. D 50 3433 

Büsser F W et al. 1972 Proc. XVI Int. Conf. on High Energy Physics (Chicago, IL)
vol 3 (Batavia: FNAL) 

——1973 Phys. Lett. B 46 471 

Cabibbo N 1963 Phys. Rev. Lett. 10 531 

Cabibbo N et al. 1979 Nucl. Phys. B 158 295 

Cacciari M et al. 2008 JHEP 0804 063 

Callan C G 1970 Phys. Rev. D 2 1541 

Callan C, Coleman S, Wess J, and Zumino B 1969 Phys. Rev. 177 2247 

Campbell J M et al. 2007 Rept. on Prog. in Phys. 70 89 

Carruthers P A 1966 Introduction to Unitary Symmetry (New York: Wiley) 

Carter A B and Sanda A I 1980 Phys. Rev. Lett. 45 952 

——1981 Phys. Rev. D 23 1567 

Caswell W E 1974 Phys. Rev. Lett. 33 244 

Catani S et al. 1991 Phys. Lett. B 269 432 

——1993 Nucl. Phys. B 406 187 

Ceccucci A et al. 2010 in Nakamura K et al. 2010 

Celmaster W and Gonsalves R J 1980 Phys. Rev. Lett. 44 560 

Chadwick J 1932 Proc. R. Soc. A 136 692 

Chanowitz M et al. 1978 Phys. Lett. B 78 285 

——1979 Nucl. Phys. B 153 402 

Chao Y et al. 2005 (Belle Collaboration) Phys. Rev. D 71 031502 

Charles J et al. 2005 (CKMfitter Group) Eur. Phys. J. C 41 1 

Chatrchyan S et al. 2012a (CMS Collaboration) Phys. Lett. B 710 26 

——2012b Phys. Lett. B 716 30

Chau L L and Keung W Y 1984 Phys. Rev. Lett. 53 1802 

Chen M-S and Zerwas P 1975 Phys. Rev. D 12 187 

Cheng T-P and Li L-F 1984 Gauge Theory of Elementary Particle Physics (Oxford:
Clarendon)

Chetyrkin K G et al. 1979 Phys. Lett. B 85 277 

——1997 Phys. Rev. Lett. 79 2184 

Chitwood D B et al. 2007 (MuLan Collaboration) Phys. Rev. Lett. 99 032001 

Christenson J H et al 1964 Phys. Rev. Lett. 13 138 

Cleveland B T et al. 1998 Astrophys. J. 496 505 

Cohen A G et al. 2009 Phys. Lett. B 678 191 

Colangelo G et al. 2001 Nucl. Phys. B 603 125 

Coleman S 1985 Aspects of Symmetry (Cambridge: Cambridge University Press) 

——1966 J. Math. Phys. 7 787 

Coleman S and Gross D J 1973 Phys. Rev. Lett. 31 851 

Coleman S, Wess J and Zumino B 1969 Phys. Rev. 177 2239 

Collins J C and Soper D E 1987 Annu. Rev. Nucl. Part. Sci. 37 383 

——1998 Phys. Lett. B 438 184 

Collins P D B and Martin A D 1984 Hadron Interactions (Bristol: Adam Hilger) 

Combridge B L et al. 1977 Phys. Lett. B 70 234 

Combridge B L and Maxwell C J 1984 Nucl. Phys. B 239 429 




Commins E D and Bucksbaum P H 1983 Weak Interactions of Quarks and Leptons 
(Cambridge: Cambridge University Press) 

Consoli M et al. 1989 Z. Phys. at LEP-I ed G Altarelli et al. CERN 89-08 (Geneva) 

Cooper L N 1956 Phys. Rev. 104 1189 

Cornwall J M et al 1974 Phys. Rev. D 10 1145 

Cowan C L et al. 1956 Science 124 103 

Czakon M 2005 Nucl. Phys. B 710 485 

Dalitz R H 1953 Phil. Mag. 44 1068 

——1965 High Energy Physics ed C de Witt and M Jacob (New York: Gordon and 
Breach) 

Danby G et al. 1962 Phys. Rev. Lett. 9 36 

Davies C T H et al. 2008 (HPQCD Collaboration) Phys. Rev. D 78 114507 

Davies C T H et al. 2004 (HPQCD, UKQCD, MILC and Fermilab Collaborations) 
Phys. Rev. Lett. 92 022001 

Davis R 1955 Phys. Rev. 97 766 

——1964 Phys. Rev. Lett. 12 303 

Davis R et al. 1968 Phys. Rev. Lett. 20 1205 

Dawson S et al. 1990 The Higgs Hunters Guide (Reading, MA: Addison-Wesley) 

de Groot J G H et al. 1979 Z. Phys. C 1 143 

DiLella L 1985 Annu. Rev. Nucl. Part. Sci. 35 107 

——1986 Proc. Int. Europhysics Conf. on High Energy Physics, Bari, Italy, July 
1985 eds L Nitti and G Preparata (Bari: Laterza) pp 761 

Dine M and Sapirstein J 1979 Phys. Rev. Lett. 43 668 

Dirac P A M 1931 Proc. R. Soc. A 133 60 

Dittmaier S et al. 2011 (LHC Higgs Cross section Working Group Collaboration) 
Handbook of LHC Higgs Cross sections: 1. Inclusive Observables CERN-2011- 
002 (arXiv:1101.0593 [hep-ph]) 

——2012 Handbook of LHC Higgs Cross Sections: 2. Differential Distributions 
CERN-2012-002 (arXiv:1201.3084 [hep-ph]) 

Dokshitzer Yu L 1977 Sov. Phys. JETP 46 641 

Donoghue J F, Golowich E and Holstein B R 1992 Dynamics of the Standard Model 
(Cambridge: Cambridge University Press) 

Duke D W and Owens J F 1984 Phys. Rev. D 30 49 

Dürr S et al. 2008 (Budapest-Marseille-Wuppertal Collaboration) Science 322 1224

Eden R J, Landshoff P V, Olive D I and Polkinghorne J C 1966 The Analytic S- 
Matrix (Cambridge: Cambridge University Press) 

Eichten E et al. 1980 Phys. Rev. D 21 203 

Einhorn M B and Wudka J 1989 Phys. Rev. D 39 2758 

Eitel K et al. 2005 Nucl. Phys. (Proc. Suppl.) B 143 197 

Elias-Miro J et al. 2012 Phys. Lett. B 709 222 

Ellis J et al. 1976 Nucl. Phys. B 111 253 

——1977 Erratum ibid. B 130 516 

——1994 Phys. Lett. B 333 118 

Ellis R K, Stirling W J and Webber B R 1996 QCD and Collider Physics (Cambridge: 
Cambridge University Press) 

Ellis S D and Soper D E 1993 Phys. Rev. D 48 3160 

Ellis S D et al. 2008 Prog. Part. Nucl. Phys. 60 484 

Englert F and Brout R 1964 Phys. Rev. Lett. 13 321 




Enz C P 1992 A Course on Many-Body Theory Applied to Solid-State Physics (World 
Scientific Lecture Notes in Physics 11) (Singapore: World Scientific) 

——2002 No Time to be Brief (Oxford: Oxford University Press) 

Fabri E and Picasso L E 1966 Phys. Rev. Lett. 16 408 

Faddeev L D and Popov V N 1967 Phys. Lett. B 25 29 

Feinberg G et al 1959 Phys. Rev. Lett. 3 527, especially footnote 9 

Fermi E 1934a Nuovo Cimento 11 1 

——1934b Z. Phys. 88 161 

Feynman R P 1963 Acta Phys. Polon. 26 697 

——1977 in Weak and Electromagnetic Interactions at High Energies ed R Balian 
and C H Llewellyn Smith (Amsterdam: North-Holland) p 121 

Feynman R P and Gell-Mann M 1958 Phys. Rev. 109 193 

Feynman R P and Hibbs A R 1965 Quantum Mechanics and Path Integrals (New
York: McGraw-Hill) 

Fritzsch H and Gell-Mann M 1972 Proc. XVI Int. Conf. on High Energy Physics, 
Batavia IL eds J D Jackson and R G Roberts, pp 135-165 

Fritzsch H, Gell-Mann M and Leutwyler H 1973 Phys. Lett. B 47 365 

Fukuda Y et al. 1998 Phys. Rev. Lett. 81 1562 

Fukugita M and Yanagida T 1986 Phys. Lett. B 174 45 

Gaillard M K and Lee B W 1974 Phys. Rev. D 10 897 

Gamow G and Teller E 1936 Phys. Rev. 49 895 

Gasser J and Leutwyler H 1982 Phys. Rep. 87 77 

—— 1984 Ann. Phys. 158 142 

——1985 Nucl. Phys. B 250 465 

Geer S 1986 High Energy Physics 1985, Proc. Yale Theoretical Advanced Study In- 
stitute eds Bowick M J and Gursey F (Singapore: World Scientific) 

Gell-Mann M 1961 California Institute of Technology Report CTSL-20 (reprinted in
Gell-Mann and Ne’eman 1964)

Gell-Mann M et al. 1979 Supergravity ed D Freedman and P van Nieuwenhuizen
(Amsterdam: North-Holland) p 315

Gell-Mann M and Levy M 1960 Nuovo Cimento 16 705 

Gell-Mann M and Low F E 1954 Phys. Rev. 95 1300 

Gell-Mann M and Ne’eman 1964 The Eightfold Way (New York: Benjamin) 

Gell-Mann M and Pais A 1955 Phys. Rev. 97 1387 

Georgi H et al. 1978 Phys. Rev. Lett. 40 692 

Georgi H and Politzer H D 1974 Phys. Rev. D 9 416 

Gibbons L K 1993 (E731 Collaboration) Phys. Rev. Lett. 70 1203 

Ginsparg P and Wilson K G 1982 Phys. Rev. D 25 25 

Ginzburg V I and Landau L D 1950 Zh. Eksp. Teor. Fiz. 20 1064 

Giri A et al. 2003 Phys. Rev. D 68 054018 

Glashow S L 1961 Nucl. Phys. 22 579 

Glashow S L et al. 1978 Phys. Rev. D 18 1724 

Glashow S L, Iliopoulos J and Maiani L 1970 Phys. Rev. D 2 1285

Goldberger M L and Treiman S B 1958 Phys. Rev. 95 1300 

Goldhaber M et al 1958 Phys. Rev. 109 1015 

Goldstone J 1961 Nuovo Cimento 19 154 

Goldstone J, Salam A and Weinberg S 1962 Phys. Rev. 127 965 

Gorishnii S G and Larin S A 1986 Phys. Lett. 172 109 

Gorishnii S G et al. 1991 Phys. Lett. B 259 144 




Gorkov L P 1959 Zh. Eksp. Teor. Fiz. 36 1918 

Gottschalk T and Sivers D 1980 Phys. Rev. D 21 102 

Gray A et al 2005 (HPQCD and UKQCD Collaborations) Phys. Rev. D 72 094507 

Greenberg O W 1964 Phys. Rev. Lett. 13 598 

Gribov V N and Lipatov L N 1972 Sov. J. Nucl. Phys. 15 438 

Gronau M 1991 Phys. Lett. B 265 389 

Gronau M and London D 1990 Phys. Rev. Lett. 65 3381 

Gross D J and Llewellyn Smith C H 1969 Nucl. Phys. B 14 337 

Gross D J and Wilczek F 1973 Phys. Rev. Lett. 30 1343 

——1974 Phys. Rev. D 9 980 

Grossman Y et al. 2005 Phys. Rev. D 72 031501 

Gupta R S et al. 2012 How well do we need to measure the Higgs boson couplings?
arXiv:1206.3560 [hep-ph]

Guralnik G S et al. 1964 Phys. Rev. Lett. 13 585 

—— 1968 Advances in Particle Physics vol 2, ed R Cool and R E Marshak (New 
York: Interscience) pp 567ff 

Haag R 1958 Phys. Rev. 112 669 

Hagiwara K et al. 2002 Phys. Rev. D 66 010001 

Halzen F and Martin A D 1984 Quarks and Leptons (New York: Wiley) 

Hamberg R et al. 1991 Nucl. Phys. B 359 343

Hambye T and Riesselmann K 1997 Phys. Rev. D 55 7255

Hamermesh M 1962 Group Theory and its Applications to Physical Problems
(Reading, MA: Addison-Wesley) 

Han M Y and Nambu Y 1965 Phys. Rev. B 139 1066 

Harrison P F and Quinn H R 1998 The BaBar physics book: Physics at an asym- 
metric B factory SLAC-R-0504 

Hasenfratz P et al. 1998 Phys. Lett. B 427 125 

Hasenfratz P and Niedermayer F 1994 Nucl. Phys. B 414 785 

Hasert F J et al. 1973 Phys. Lett. B 46 138 

Heisenberg W 1932 Z. Phys. 77 1 

Higgs P W 1964 Phys. Rev. Lett. 13 508 

——1966 Phys. Rev. 145 1156 

Hirata K S et al. 1989 Phys. Rev. Lett. 63 16 

Höcker A et al. 2001 Eur. Phys. J. C 21 225

Hollik W 1990 Fortsch. Phys. 38 165 

——1991 1989 CERN-JINR School of Physics CERN 91-07 (Geneva) p 50ff 

Hornbostel K, Lepage G P and Morningstar C 2003 Phys. Rev. D 67 034023 

Hosaka J et al. 2006 Phys. Rev. D 73 112001 

Hughes R J 1980 Phys. Lett. B 97 246 

——1981 Nucl. Phys. B 186 376 

Isgur N and Wise M B 1989 Phys. Lett. B 232 113 

——1990 Phys. Lett. B 237 527 

Isidori G et al. 2001 Nucl. Phys. B 609 387 

Jackiw R 1972 Lectures in Current Algebra and its Applications ed S B Treiman, R 
Jackiw and D J Gross (Princeton, NJ: Princeton University Press) pp 97-254 

Jacob M and Landshoff P V 1978 Phys. Rep. C 48 285 

Jarlskog C 1985 Phys. Rev. Lett. 55 1039 

Jones D R T 1974 Nucl. Phys. B 75 531 

Jones H F 1990 Groups, Representations and Physics (Bristol: IOP Publishing) 




Kabir P K 1968 The CP Puzzle: Strange Decays of the Neutral Kaon (London and 
New York: Academic Press) 

Kadanoff L P 1977 Rev. Mod. Phys. 49 267 

Kaplan D B 1992 Phys. Lett. B 288 342 

Kayser B 1981 Phys. Rev. D 24 110 

Kennedy D C et al. 1989 Nucl. Phys. B 321 83 

Kennedy D C and Lynn B W 1989 Nucl. Phys. B 322 1 

Kibble T W B 1967 Phys. Rev. 155 1554 

Kim K J and Schilcher K 1978 Phys. Rev. D 17 2800 

Kinoshita T 1962 J. Math. Phys. 3 650 

Kittel C 1987 Quantum Theory of Solids second revised printing (New York: Wiley) 

Klapdor-Kleingrothaus H V et al. (Heidelberg-Moscow Collaboration) 2001 Eur. 
Phys. J. A 12 147 

—2006 Mod. Phys. Lett. A 21 1547 

Klein O 1948 Nature 161 897 

Kluth S 2006 Rept. on Prog. in Phys. 69 1771 

Kobayashi M 2009 Rev. Mod. Phys. 81 1019 

Kobayashi M and Maskawa T 1973 Prog. Theor. Phys. 49 652

Kogut J B and Susskind L 1975 Phys. Rev. D 11 395 

Kramer G and Lampe B 1987 Z. Phys. C 34 497 

Krastev P I and Petcov S T 1988 Phys. Lett. B 205 84 

Kugo T and Ojima I 1979 Prog. Theor. Phys. Suppl. 66 1 

Kunszt Z and Pietarinen E 1980 Nucl. Phys. B 164 45

Kusaka A et al. 2007 (Belle Collaboration) Phys. Rev. Lett. 98 221602 

Kuzmin V A, Rubakov V A and Shaposhnikov M E 1985 Phys. Lett. B 155 36 

Landau L D 1948 Dokl. Akad. Nauk. USSR 60 207 

——1957 Nucl. Phys. 3 127 

Landau L D and Lifshitz E M 1980 Statistical Mechanics part 1, 3rd edn (Oxford: 
Pergamon) 

Langacker P (ed) 1995 Precision Tests of the Standard Electroweak Model (Singa- 
pore: World Scientific) 

Larin S A and Vermaseren J A M 1991 Phys. Lett. B 259 345 

——1993 Phys. Lett. B 303 334 

Lautrup B 1967 Kon. Dan. Vid. Selsk. Mat.-Fys. Med. 35 1 

Lee B W et al. 1977a Phys. Rev. Lett. 38 883 

——1977b Phys. Rev. D 16 1519 

Lee T D and Nauenberg M 1964 Phys. Rev. B 133 1549 

Lee T D, Rosenbluth M and Yang C N 1949 Phys. Rev. 75 905

Lee T D and Yang C N 1956 Phys. Rev. 104 254 

——1957 Phys. Rev. 105 1671 

——1962 Phys. Rev. 128 885 

LEP 2003 (The LEP Working Group for Higgs Searches, ALEPH, DELPHI, L3 and 
OPAL Collaborations) Phys. Lett. B 565 61 

Lepage G P and Mackenzie P B 1993 Phys. Rev. D 48 2250

Leutwyler H 1996 Phys. Lett. B 378 313 

Lichtenberg D B 1970 Unitary Symmetry and Elementary Particles (New York: 
Academic) 

Lipkin H J et al. 1991 Phys. Rev. D 44 1454 

Llewellyn Smith C H 1973 Phys. Lett. B 46 233 




Lobashev V et al. 2003 Nucl. Phys. A 719 153c 

London F 1950 Superfluids Vol I, Macroscopic theory of Superconductivity (New 
York: Wiley) 

Lüscher M 1981 Nucl. Phys. B 180 317

——1986 Commun. Math. Phys. 105 153 

——1991a Nucl. Phys. B 354 531

——1991b Nucl. Phys. B 364 237 

——1998 Phys. Lett. B 428 342 

Lüscher M and Weisz P 1985 Phys. Lett. B 158 250

Lüscher M et al. 1980 Nucl. Phys. B 173 365

Majorana E 1937 Nuovo Cimento 5 171 

Maki Z, Nakagawa M and Sakata S 1962 Prog. Theor. Phys. 28 870 

Mandelstam S 1976 Phys. Rep. C 23 245 

Mandl F 1992 Quantum Mechanics (New York: Wiley) 

Marciano W J and Sirlin A 1988 Phys. Rev. Lett. 61 1815 

Marshak R E et al. 1969 Theory of Weak Interactions in Particle Physics (New York:
Wiley)

Martin A D et al. 1994 Phys. Rev. D 50 6734 

——2002 Eur. Phys. J. C 23 73 

Maskawa T 2009 Rev. Mod. Phys. 81 1027 

Merzbacher E 1998 Quantum Mechanics 3rd edn (New York: Wiley) 

Mikheev S P and Smirnov A Y 1985 Sov. J. Nucl. Phys. 42 913 

——1986 Nuovo Cimento 9 C 17 

Minkowski P 1977 Phys. Lett. B 67 421 

Mohapatra R N et al. 1968 Phys. Rev. Lett. 20 1081 

Mohapatra R N and Senjanovic G 1980 Phys. Rev. Lett. 44 912 

——1981 Phys. Rev. D 23 165 

Montanet L et al. 1994 Phys. Rev. D 50 1173 

Montvay I and Münster G 1994 Quantum Fields on a Lattice (Cambridge:
Cambridge University Press)

Morningstar C and Peardon M J 2004 Phys. Rev. D 69 054501 

Muta T 2010 Foundations of Quantum Chromodynamics 3rd edn (Singapore:
World Scientific)

Nakamura K et al. 2010 (Particle Data Group) J. Phys. G 37 075021

Nambu Y 1960 Phys. Rev. Lett. 4 380 

——1974 Phys. Rev. D 10 4262 

Nambu Y and Jona-Lasinio G 1961a Phys. Rev. 122 345 

——1961b Phys. Rev. 124 246 

Nambu Y and Lurie D 1962 Phys. Rev. 125 1429 

Nambu Y and Schrauner E 1962 Phys. Rev. 128 862 

Narayanan R and Neuberger H 1993a Phys. Lett. B 302 62 

——1993b Phys. Rev. Lett. 71 3251 

——1994 Nucl. Phys. B 412 574 

——1995 Nucl. Phys. B 443 305 

Nauenberg M 1999 Phys. Lett. B 447 23 

Ne’eman Y 1961 Nucl. Phys. 26 222 

Neuberger H 1998a Phys. Lett. B 417 141 

——1998b Phys. Lett. B 427 353 

Nielsen N K 1981 Am. J. Phys. 49 1171 




Nielsen H B and Ninomiya M 1981a Nucl. Phys. B 185 20

——1981b Nucl. Phys. B 193 173 

——1981c Nucl. Phys. B 195 541 

Nir Y 1989 Phys. Lett. B 221 184 

Noaki J et al. 2008 Phys. Rev. Lett. 101 202004 

Noether E 1918 Nachr. Ges. Wiss. Göttingen 171

Oddone P 1989 Ann. N.Y. Acad. Sci. 578 237 

Okubo S 1962 Prog. Theor. Phys. 27 949 

Pais A 2000 The Genius of Science (Oxford: Oxford University Press) 

Pak A and Czarnecki A 2008 Phys. Rev. Lett. 100 241807 

Parry W E 1973 The Many Body Problem (Oxford: Clarendon) 

Pascoli S et al. 2007a Phys. Rev. D 75 083511 

——2007b Nucl. Phys. B 774 1 

Pauli W 1934 Rapp. Septième Conseil Phys. Solvay, Brussels 1933 (Paris: Gauthier-
Villars), reprinted in Winter (2000) pp 7, 8

Peccei R D and Quinn H 1977a Phys. Rev. Lett. 38 1440 

——1977b Phys. Rev. D 16 1791 

Perkins D H 1975 in Proc. Int. Symp. on Lepton and Photon Interactions at High 
Energies, Stanford, CA p 571 

Peskin M E 1997 in 1996 European School of High Energy Physics ed N Ellis and M 
Neubert CERN 97-03 (Geneva) pp 49-142 

Peskin M E and Schroeder D V 1995 An Introduction to Quantum Field Theory 
(Reading, MA: Addison-Wesley) 

Politzer H D 1973 Phys. Rev. Lett. 30 1346 

Poluektov A et al. 2010 (Belle Collaboration) Phys. Rev. D 81 112002

Pontecorvo B 1946 Chalk River Laboratory Report PD-205 

——1947 Phys. Rev. 72 246 

——1957 Zh. Eksp. Teor. Fiz. 33 549

——1958 ibid. 34 247 

—— 1967 ibid. 53 1717 (Engl. transl. Sov. Phys. JETP 26 984) 

Prescott C Y et al. 1978 Phys. Lett. B 77 347 

Puppi G 1948 Nuovo Cimento 5 505 

Quigg C 1977 Rev. Mod. Phys. 49 297 

Rajaraman R 1982 Solitons and Instantons (Amsterdam: North-Holland) 

Reines F and Cowan C 1956 Nature 178 446 

Reines F, Gurr H and Sobel H 1976 Phys. Rev. Lett. 37 315 

Renton P 1990 Electroweak Interactions (Cambridge: Cambridge University Press) 

Richardson J L 1979 Phys. Lett. B 82 272 

Rodrigo G and Santamaria A 1993 Phys. Lett. B 313 441 

Ross D A and Veltman M 1975 Nucl. Phys. B 95 135 

Rubbia C et al. 1977 Proc. Int. Neutrino Conf., Aachen, 1976 (Braunschweig: 
Vieweg) p 683 

Ryder L H 1996 Quantum Field Theory 2nd edn (Cambridge: Cambridge University 
Press) 

Sakurai J J 1958 Nuovo Cimento 7 649 

—1960 Ann. Phys., NY 11 1 

Salam A 1957 Nuovo Cimento 5 299 

—— 1968 Elementary Particle Physics ed N Svartholm (Stockholm: Almqvist and 
Wiksells) 




Salam A and Ward J C 1964 Phys. Lett. 13 168 

Salam C P 2010 Eur. Phys. J. C 14 47 

Samuel M A and Surguladze L R 1991 Phys. Rev. Lett. 66 560 

Schael S et al. 2006 (ALEPH, DELPHI, L3, OPAL, SLD, LEP Electroweak Working 
Group, SLD Electroweak and Heavy Flavour Groups) Phys. Reports 427 257 

Schiff L I 1968 Quantum Mechanics 3rd edn (New York: McGraw-Hill) 

Schrieffer J R 1964 Theory of Superconductivity (New York: Benjamin) 

Schutz B F 1988 A First Course in General Relativity (Cambridge: Cambridge 
University Press) 

Schwinger J 1957 Ann. Phys., NY 2 407 

——1962 Phys. Rev. 125 397 

Shaevitz M H et al. 1995 (CCFR Collaboration) Nucl. Phys. B Proc. Suppl. 38 188 

Sharpe S R 2006 PoS LAT2006 022 (hep-lat/0610094)

Shaw R 1955 The problem of particle types and other contributions to the theory
of elementary particles PhD Thesis University of Cambridge 

Sikivie P et al. 1980 Nucl. Phys. B 173 189 

Sirlin A 1980 Phys. Rev. D 22 971 

——1984 Phys. Rev. D 29 89 

Slavnov A A 1972 Teor. Mat. Fiz. 10 153 (Engl. transl. Theor. and Math. Phys. 10 
99) 

Snyder A E and Quinn H R 1993 Phys. Rev. D 48 2139 

Sommer R 1994 Nucl. Phys. B 411 839 

Spergel D et al. 2007 Astrophys. J. Supp. 170 377 

Staff of the CERN pp project 1981 Phys. Lett. B 107 306 

Stange A et al. 1994a Phys. Rev. D 49 1354 

——1994b Phys. Rev. D 50 4491 

Staric M et al. 2007 (Belle Collaboration) Phys. Rev. Lett. 98 211803 

Steinberger J 1949 Phys. Rev. 76 1180 

Sterman G and Weinberg S 1977 Phys. Rev. Lett. 39 1436 

Stueckelberg E C G and Peterman A 1953 Helv. Phys. Acta 26 499 

Sudarshan E C G and Marshak R E 1958 Phys. Rev. 109 1860 

Susskind L 1977 Phys. Rev. D 16 3031 

——1979 Phys. Rev. D 19 2619 

Sutherland D G 1967 Nucl. Phys. B 2 433 

Symanzik K 1970 Commun. Math. Phys. 18 227 

——1983 Nucl. Phys. B 226 187, 205 

Tarasov O V et al. 1980 Phys. Lett. B 93 429 

Tavkhelidze A 1965 Seminar on High Energy Physics and Elementary Particles 
(Vienna: IAEA) p 763 

Taylor J C 1971 Nucl. Phys. B 33 436 

— 1976 Gauge Theories of Weak Interactions (Cambridge: Cambridge University 
Press) 

’t Hooft G 1971a Nucl. Phys. B 33 173

——1971b Nucl. Phys. B 35 167 

——1971c Phys. Lett. B 37 195 

——1976a Phys. Rev. D 14 3432 

—1976b High Energy Physics, Proc. European Physical Society Int. Conf. ed A 
Zichichi (Bologna: Editrice Compositori) p 1225




——1980 Recent Developments in Gauge Theories, Cargese Summer Institute 1979 
ed G ’t Hooft et al. (New York: Plenum) 

——1986 Phys. Rep. 142 357 

’t Hooft G and Veltman M 1972 Nucl. Phys. B 44 189

Tiomno J and Wheeler J A 1949 Rev. Mod. Phys. 21 153 

Ur A C et al. (GERDA Collaboration) 2011 Nucl. Phys. Proc. Suppl. 217 38 

Valatin J G 1958 Nuovo Cimento 7 843 

van der Bij J J 1984 Nucl. Phys. B 248 141 

van der Bij J J and Veltman M 1984 Nucl. Phys. B 231 205 

van Neerven W L and Zijlstra E B 1992 Nucl. Phys. B 382 11

van Ritbergen T and Stuart R G 1999 Phys. Rev. Lett. 82 488 

van Ritbergen T et al. 1997 Phys. Lett. B 400 379

Veltman M 1967 Proc. R. Soc. A 301 107 

——1968 Nucl. Phys. B 7 637 

—1970 Nucl. Phys. B 21 288 

—1977 Acta Phys. Polon. B 8 475 

von Weizsäcker C F 1934 Z. Phys. 88 612

Wegner F 1972 Phys. Rev. B 5 4529 

Weinberg S 1966 Phys. Rev. Lett. 17 616 

——1967 Phys. Rev. Lett. 19 1264 

——1973 Phys. Rev. D 8 605, especially footnote 8 

——1975 Phys. Rev. D 11 3583 

——1978 Phys. Rev. Lett. 40 223 

—1979a Physica A 96 327 

——1979b Phys. Rev. D 19 1277 

—— 1996 The Quantum Theory of Fields Vol II Modern Applications (Cambridge: 
Cambridge University Press) 

Weisberger W 1965 Phys. Rev. Lett. 14 1047 

Weisskopf V F and Wigner E P 1930a Z. Phys. 63 54 

——1930b Z. Phys. 65 18 

Wilczek F 1978 Phys. Rev. Lett. 40 279 

Williams E J 1934 Phys. Rev. 45 729 

Wilson K G 1969 Phys. Rev. 179 1499 

——1971a Phys. Rev. B 4 3174

——1971b Phys. Rev. B 4 3184 

——1974 Phys. Rev. D 10 2445 

——1975 New Phenomena in Subnuclear Physics, Proc. 1975 Int. School on Subnu- 
clear Physics Ettore Majorana ed A Zichichi (New York: Plenum) 

Wilson K G and Kogut J 1974 Phys. Rep. 12C 75 

Winter K 2000 Neutrino Physics 2nd edn (Cambridge: Cambridge University Press) 

Wolfenstein L 1978 Phys. Rev. D 17 2369 

——1983 Phys. Rev. Lett. 51 1945 

Wu C S et al. 1957 Phys. Rev. 105 1413 

Yanagida T 1979 Proc. Workshop on Unified Theory and Baryon Number in the 
Universe ed O Sawada and A Sugamoto (Tsukuba: KEK) 

Yang C N 1950 Phys. Rev. 77 242 

Yang C N and Mills R L 1954 Phys. Rev. 96 191 

Yosida K 1958 Phys. Rev. 111 1255 


Index 


Adjoint representation, see Representa- 
tion, adjoint 
Adler sum rule, 316 
Adler-Weisberger relation, 228, 231
Aharonov-Bohm effect, 462 
Algebra, Lie 
Lorentz group, 442, 448 
SO(3), 439
Anomalous dimension, 132, 179
Anomalous mass dimension, 135, 146 
Anomaly, chiral, 249-253 
cancellation of, in a gauged current, 
252-253 
Anti-screening, in QCD, 125-126 
Antiparticles 
isospin representation for, 15-16 
SU(3)f representation for, 20-21 
Asymmetric collider, 335 
Asymmetries 
Af, 392, 410 
Ars, 391, 410 
Arr, 392 
AFR, 392 
CP-violating 
Aş, 337 
AsL, 337
Asymptotic freedom, 73, 86, 115, 124- 
Asymptotic scaling, 184


Auxiliary field, 61-62 
Baryon asymmetry of the universe, 353
Baryon number, conservation of, 7, 25 
Basis vectors, changes in, for polar coor- 
dinates, 454-455 
BCS 
ground state for superconductor, 199, 
221 
not a number operator eigenstate, 
221-225 
Hamiltonian, 220 
theory, 219-225, 260-261, 266 
β-function
of QCD, 124-127 
of QED, 122 
zero of (‘fixed point’), 132-133 
Bjorken limit, 309 
and scaling, 309-310, 327 
Bloch-Nordsieck theorem, 105
Bogoliubov 
canonical transformation, 205, 221 
ground state, 207-209 
non-vanishing vev in, 208 
not a number operator eigenstate, 
204-205, 207 
quasi-particle operators, 205, 221 
superfluid, 199, 202-209 
B⁰-B̄⁰ oscillations, 323, 335-345


Cabibbo 

angle, 304-306 

hypothesis, 304-305 
Callan-Gross relation, 136, 328 
Callan-Symanzik equation, 119 
Casimir operator, 81, 444 

Lorentz group, 448 
Charge
chiral, in SU(2)f5, 36, 232
independence of nuclear forces, 5, 
13 
multiplets, in nuclear physics, 12-14 
quantization of, 47 
Charge-conjugation symmetry 
conserved in QCD, 84 
in SU(2), 16, 404 
violated in weak interactions, 292 
Charged current process, 298 
Chiral anomaly, see Anomaly, chiral 
Chiral perturbation theory, 187, 245-248 
and meson mass relations, 247-248 
and pion-pion scattering, 245-246 
and quark masses, 248 
Chiral symmetry breaking, see Symme- 
try, chiral, breaking 
Chirality, 32, 230 
for antifermions, 32 
current, 33 
operator, 34 
projection operators, 33, 233, 289 
CKM matrix, 236, 319-323, 329, 409-410 
CP-violating phase in, 319 
plot of constraints on, 323, 345 
unitarity of, 320-321 
Wolfenstein parametrization of, 321- 
Coarse graining, 174-175
Coherence length, in a superconductor, 
225, 268 
Collinear divergence, 101-102, 104, 140 
Colour, 4 
degree of freedom, 74-77 
factor, 107, 109-111 
and π⁰ → γγ, 77, 252
singlet, 75 
as an SU(3) group, 78-80 
Complex conjugate representation 
in SU(2), 15-16 
in SU(3), 20-21 
Compton cross section, virtual, 138 
Condensate, 204, 220, 222, 229, 261 
Nambu, 229-230
monopole, 271 
superconductor, 220, 222, 261 
Connection
curved space, 457-458 
gauge, 459 
Cooper pair, 220, 261, 270 
Correlation length, 172 
Covariant derivative, 41-47, 158 
and coordinate transformations, 453-459
in lattice field theory, 158-159 
in SU(2), 40-47 
in SU(3), 49-50, 83 
in SU(2) x U(1), 275, 382 
in U(1), 41 
CP problem, strong, 86 
CP symmetry, 292 
CP violation, 292, 410 
angle α(φ₂), 322, 340-345
angle β(φ₁), 322, 338-340
angle γ(φ₃), 322, 331-335, 343
in B meson oscillations, 335-345 
in D decays, 349-350 
direct, in B decays, 330-335 
in K decays, 345-349 
in neutrino oscillations, 353, 364 
not possible with two SM genera- 
tions, 317-319 
parameters ε and ε′, 348-349
possible with three SM generations, 
319 
Critical exponent, 132, 179 
Critical point, 172 
and fixed point of renormalization 
group transformation, 176-177 
Critical temperature, 202 
Cross section for 
e⁺e⁻ → hadrons in parton model,
73-74
QCD corrections, 113-115, 128- 
130 
e⁺e⁻ → μ⁺μ⁻ or τ⁺τ⁻, 390-392
v +N > u + X, 308-313 
D +N > + X, 308-313 
De+e — De +e” , 390 
Vu +e —> bw + Ve, 298-302, 326- 
327, 389 
Vu +E —> Vu +e , 389 
Va +N > p` +X, 308 
Vu +p>p +X, 327-328 


Hite  — Da +e , 389-390 
Da +p—> pt +X, 311 
W±, Z production in p̄p collider,
393-394 
Curie temperature, 200 
Current 
Cabibbo, 306 
chirality, 33-34 
dynamical, in U(1), 28 
GIM, 307 
symmetry 
in spontaneously broken symme- 
try, 216, 238, 244 
in SU(2)f, 27-30, 232
in SU(2)f5, 35-37, 232, 249
in SU(3)f, 30
in SU(2)f x SU(2)f5, 232, 243
in SU(2)L x SU(2)R, 233, 243
weak, 37, 227 
leptonic charged, 296-297, 383- 
384 
leptonic neutral, 302-304, 385-386 
quark, 304-308, 317-324, 386-387 
Current-current theory, 284-287 
difficulties with, 367-369 
for leptons, 296-298 
relation to GSW theory, 383-384 
Curvature, 460-461 
and gauge field strength tensor, 459-462
tensor, Riemann, 461
Custodial SU(2), 415-418 


Dalitz plot, 101, 334, 343-345 

Decoupling, of massive particles, 415-416 

Deep inelastic scattering, neutrino, 308-317

Degenerate ground state, 199-202, 213 

ΔS = ΔQ rule, 305-307

|ΔS| = 1 rule, 305-307

DGLAP equation, 143, 147 

Dimensional regularization, see Regular- 
ization, dimensional 

Dimensions, large extra, 422 

Dirac charge quantization condition, 270- 
271 

Discretization, in lattice field theory, 152- 
155, 158-161 

Dirac fields, 154-158 
gauge fields, 158-161
scalar fields, 152-154 
Distribution function 
evolution of, 143-144, 314-315 
for xF, 315 
gluon, 94-95, 143, 147 
parton, 93, 144 
quark and antiquark, 94, 142-147 
Drell-Yan process, 76, 87, 89 
Effective Hamiltonian for B-B mixing, 336 
Effective interaction, 173 
weak current-current, 297, 305 
weak lepton-quark, 308 
weak quark-quark, 324 
Effective Lagrangian, see Lagrangian, ef- 
fective 
Effective theory, 228 
heavy quark, 325 
Electroweak theory, 4, 377-431 
Energy gap, in a superconductor, 219, 
222, 228, 230 
Equivalent photon approximation, 141 
and spontaneous breaking of SU(3)f5,
232 
Euclidean 
Dirac matrices, 155 
four-momentum, 463 
space-time, 153, 164, 171 
Euler-Mascheroni constant, 464


Fabri-Picasso theorem, 197-199, 209 
Factorization, 88, 142-145 
scale, 142 
Fermi 
coupling constant, 236, 297 
current-current theory, 284-287 
connection with IVB model, 373 
energy, 220, 230 
momentum, 225 
Fermion 
classically perfect, 158 
determinant, 169-170, 470-471 
domain wall, 158 
doubling problem, 155 
mass generation by Higgs coupling, 
403, 421 
Majorana case, 405-407 
mass problem, 401-403 
Fermion (continued)
overlap, 158, 247 
propagator, in lattice field theory, 
155-157 
staggered, 157, 170, 187 
Wilson, 155-157, 190 
Ferromagnet, 199-202, 213-214 
Feynman 
rules for tree graphs 
in electroweak theory, 474-479 
in QCD, 473-474 
sum over paths formulation of quan- 
tum theory, 162-170 
fermions in, 169-170, 467-471 
gauge-fixing in, 69 
propagators in, 167 
Field strength tensor, see Tensor, field 
strength 
Fine-tuning, of Higgs mass, 182, 422 
Finite-size effects, 182 
Fixed point (in RGE), 133, 176-177 
infrared stable, 133, 178 
infrared unstable, 178 
of renormalization transformation, 
176 
ultraviolet stable, 133 
and zero of β function, 133
Flavours, number of active, 309, 315 
Flavour tagging, 335-338, 347 
Flux quantization, in a superconductor, 
267-270 
Four-fermion interaction, 286 
effective (non-leptonic), 324 
not renormalizable, 297, 369, 374 
and violation of unitarity, 368-369 
Fundamental representation, see Repre- 
sentation, fundamental 
R, 274
255, 373, 377
SU(2), 40-49, 51-53 
non-Abelian, 3, 42-45, 49-53, 57,
276 
U, 274, 276 
Gell-Mann matrices, 19, 442-443 
Gell-Mann-Nishijima relation, 20, 380 
Gell-Mann-Okubo formula, 248
Generations 
and anomaly cancellation, 253 
three, mixing in, 317-324, 407-410 
two, mixing in, 306-308, 318-319 
Generators, 437-438 
Lorentz group, 441-442 
SO(3), 438-439 
SO(4), 440-441 
SU(2), 9, 439-440 
in quantum field theory, 26 
SU(3), 19, 442-443 
in quantum field theory, 30 
SU(2) x SU(2), 441 
in quantum field theory, 232-233 
U(1), in quantum field theory, 25 
G_F, 294, 297, 305-306, 362, 368-370, 374,
383-384, 412 
Gluon fusion, 425-426
Gluons, 50
charge conjugation for, 84 
and nucleon momentum fraction, 313 
scalar, 80-82 
Goldberger-Treiman relation, 228, 230,
235-239 
boson, 218, 227, 232, 240, 247
model, 198, 211-216 
QCD corrections to, 314-315
Group 
definition, 435
algebra of, 438
compact, 447-448 
structure constants of, 14, 438,
GSW (Glashow-Salam-Weinberg) theory,
higher-order corrections in, 410-419
tree-level predictions in, 387-392 
picture, 162
of lepton in π decay, 236-237
suppression in V-A theory, 291, 293- 
295 
Hierarchy problem, 422 
Higgs 
boson, 267, 277, 381 
decay, 427-429 
fermion couplings, 403, 405, 478 
mass, 381, 411, 414, 423-425, 430- 
431 
probable discovery, 429-431 
production, 425-427 
self-coupling, 421 
tree-level mass, 381 
and vacuum stability, 425 
doublet, and ρ = 1, 414, 432-433
field, 264-267, 271, 380, 421 
and fermion mass generation, 278-279, 403-407
unphysical, 274, 278 
384, 403
model
Abelian, 264-278 
sector of Standard Model, 216, 420-
Integrating out degrees of freedom, 173-
370-377
analysis
in B meson oscillations, 342-345 
in K decays, 348 
for antiparticles, 15-16 
in nuclear physics, 3, 5-6, 12-14 
in particle physics, 14-18 
weak, 4, 40-41, 47, 378-382 
Goldstone model, 211-213
π-N vertex in, 30, 238
quantum field theory
SU(2) x U(1) Higgs, 275-278
symmetry-breaking, 212-213, 217, 
264, 275 
U(1) Higgs, 264
Yukawa (fermion mass-generating),
403-405 
Aus, 466
Landau theory, of second-order phase tran-
Large logarithms, 116-118
and chiral symmetry breaking, 156-
158 
and gauge invariance, 158-161 
regularization in, 151-152 
renormalization in, and RGE, 172- 
Leading log corrections, 123
Leptogenesis, 353 
Lepton flavour, 295, 303, 352 
Lepton number, 293-296, 406 
Lepton tensor, see Tensor, lepton 
Lie algebra, see Algebra, Lie 
London equation, 262
266, 372
266
scattering of, 372-373 
Mass diagonal basis, 409
Mass generation 
for fermions, via chiral symmetry 
breaking, 228-231, 279 
for gauge bosons, via gauge sym- 
metry breaking, 195, 223, 255, 
259-260, 266, 276-277 
Mass scale (μ), 114, 118-123
Mass singularity, 102, 140 
uncancelled, 136-140 
Massless mode 
made massive by Coulomb interac- 
tion, 207 
when symmetry spontaneously broken, 206, 210-211, 215, 219, 228-229
Maxwell 
action, 49, 264 
in lattice field theory, 159-161 
tensor, 48 
as curvature, 159-160, 462 
Meissner effect, 260 
Mellin transform, 149 
303, 381, 385-387, 392, 400, 412
in MS renormalization scheme, 412 
Mixing formalism
MSW effect, 362-364
angle 063, 357, 359, 364
angle 0,3, 357-366 
resonant, 362 
discovery, 285
Nielsen-Ninomiya theorem, 157
Noether’s theorem, 28, 249 
introduced via local phase invariance,
quantization of, 60-71
Non-decoupling, spontaneous symmetry- 
breaking case, 414 
Non-leptonic weak interactions, 324-325 
Non-renormalizability in weak interactions 
for four-fermion and IVB theories,
374-377 
Non-renormalizable interactions 
and irrelevant interactions, 180-181 
Parallel transport, 457-459
in U(1) gauge field case, 458 
and chirality current, 34
and chirality operator, 34 
doublets, in chiral symmetry, 35-37,
195, 227 
violation 
Phase transitions, 172
Landau theory of, 202 
ductor, 260-264
decay constant, 235-237, 244, 327
massless, 217 
mass term, 245-246 
Pion-pion scattering 
in chiral perturbation theory, 245- 
and non-linear σ-model, 243-244
π⁰ → γγ, 249-252
sum over
unitarity, 62-68
240, 264, 275, 380
Propagator 
gluon, 474 
Higgs, 475 
massive vector boson, 258, 284, 475 
try, 274, 278, 377
Pseudoscalar, 288
and SU(2) gauge invariance, 54-
Compton scattering in, 57
Quantization of non-Abelian gauge fields,
see Non-Abelian gauge fields, 
quantization of 
Quark flavour mixing 
three generations, 320-324, 386, 407-410
two generations, 306-308, 318-319 
Quark masses, running, 134-135 
Quasi-particle operators 
in a superconductor, 221 
in a superfluid, 205 
Quenched approximation, 170 
proof of, 84, 377
on the lattice, 172-182
transformation, 176
for antiparticles
in SU(2)f, 15-16 
in SU(3)f, 20-21 
basis for, 18 
fundamental, 445 
of SO(3), 445 
of SU(2), 9, 445 
of SU(3), 20, 445 
of generators, in quantum field the- 
ory 
for SU(2), 26-29 
for SU(3), 30 
irreducible, of SU(2), 14 
matrix, of Lie algebras and groups, 
14, 443-446 
octet, in SU(3), 21-24, 83 
regular, 24, 44, 50, 446 
self, 446 
in SU(2), 16-17
in SU(3), 22 
triplet 
in SU(2), 17-18 
in SU(3), 20 
Resolution parameters, 105 
R gauges, 274 
ρ parameter, 386, 414-416, 433
Rooting, 170, 187 
Running coupling constant, 103-104, 115, 
123-129 
Running mass, 134-135 
Rutherford process, 92-93 


Scaling, 135 
violations, 140, 314 
148
in e⁺e⁻ → hadrons, 114-115
Scattering, hard, and QCD tree graphs, 
86-97 
Screening current, 262-263, 268 
Screening, in Higgs sector, 418-419 
Screening length, 262-263 
Schrödinger picture, 162 
Sea quarks, 89-90 
See-saw mechanism, 407 
Semi-leptonic processes, 305 
σ-model
linear, 239-242 
non-linear, 243-245 
Soft divergences, 101-102, 104
Soft pion process, 231 
Spin waves, 201-202 
Splitting function, 141, 144 
Spontaneously broken symmetry, see Sym- 
metry, spontaneously broken 
Subsidiary condition, 61 
Superconductor, BCS, 219-225, 260-261 
generation of photon mass in, 260- 
263 
type I, 268 
type II, 268 
Superfluid, Bogoliubov, see Bogoliubov, 
superfluid 
Supersymmetry, 422 
Symanzik improved action, 182, 187, 190 
Symmetry 
chiral, 31-37, 53 
breaking, 37, 227-248 
in lattice field theory, 156-158 
current, see Current, symmetry 
hadronic isospin, 12-18 
non-Abelian global, 3-38 
in quantum field theory, 24-38 
non-Abelian local, 39-70 
in quantum field theory, 51-70 
SO(3), 438-439, 450-452 
irreducible representations of, 14 
Lie algebra of, 439 
relation with SU(2), 450-452 
structure constants of, 439 
triplet, 17-18 
SO(4), 240, 440-441 
spontaneously broken, 37, 195-279 
chiral, 3-4, 227-253 
in condensed matter physics, 198-
global, 195-226
global Abelian, 211-216 
global non-Abelian, 216-219 
local, 255-279 
local Abelian, 264-274 
local non-Abelian, 275-279 
mass generation via, 5, 228-230, 
255, 260-264, 275-279 
and non-zero vev, 209-210 
SU(2), 8, 439-440 
chiral, 35-36 
Lie algebra of, 14, 440
relation with SO(3), 450-452
structure constants of, 14
SU(2)_f5, 231
global, 3, 18-24
octet, 21-24, 50-51
covariant derivative as, 456
field strength 
SU(2), 48-49 
SU(3), 51 
U(1), 48 
hadron, 99-100 
weak, 309 
lepton, 99-100 
in V-A theory, 299-300, 309 
Riemann curvature, 461 
θ-term, 84-86
θ_W, see Mixing angle, weak (θ_W)
Three-jet events
in e⁺e⁻ annihilation, 97-103
in p̄p collisions, 95-97
Time reversal 
symmetry in QCD, 84 
Top quark, 419-420 
virtual, effect of, 413-414, 419-420 
Transition temperature 
ferromagnetic, 214 
superconducting, 261, 263 
superfluid, 203 
Transverse mass, 399 
Two-jet events 
in e⁺e⁻ annihilation, definition of,
103-106 
in p̄p collisions, 88-95


U gauges, 274, 276, 278, 380, 474-475 
U(1), 24-25, 27-28, 35, 40, 51, 66, 203, 
212, 217, 229, 231, 264, 275, 
380 
problem, 232 
U(1)_f, 231
U(1)_f5, 34, 231-232
Unitarity 
relation, 62 
triangle, in CP violation, 320-322 
plot of constraints on, 323, 345 
violation 
in current-current models, 368- 
369 
due to explicit fermion mass, 401-403
due to unphysical degrees of free- 
dom, 62-67 
and ghosts, 68-69 
in IVB model, 371-373 
in scattering of longitudinal W’s, 
371-373 
Universality 
of coupling strengths, in a gauge the- 
ory, 47 
and renormalizable effective theory,
181 
in weak interactions, 255, 296-297, 
analogous to many-body ground
state, 199 
expectation value (vev), 208, 210- 
211, 218, 230 
not invariant under symmetry, 196 
and spontaneous symmetry break- 
ing, 197-199, 208, 215, 218, 230 
Vacuum polarization, 125, 170 
Vector 
axial, 288 
polar, 288 
Vector boson fusion, 426 
Vector particle 
massive, 255-259 
propagator, 256, 475 
massless, 259 
propagator, 259, 274, 474 
Vertex 
four-gluon, 474 
four-W, 477 
four-X, 59-60 
Higgs-fermion, 403-404, 408-409, 
478 
Higgs self-coupling, 479 
Higgs-vector boson, 478-479 
isospinor-W, 46 
lepton-W, 382-384, 476 
lepton-Z, 385, 476 
quark-gluon, 53, 474 
three-gluon, 474 
three-W, 477 
three-X, 55, 59-60 
X-X-γ, 58-59
X-X-γ-γ, 58-59
V-A theory, 237, 288-292 
Vortex, superconducting, 268
discovery, 395-401
mass, 381, 388, 397-399, 410 
radiatively corrected, 400, 412 
tree-level, 388 
W-boson (continued)
production cross section in pp collider, 393-394
scattering of longitudinally polarized, 424-425
Weak interaction basis, 409 
Weak isospin, see Isospin, weak 
Weyl equation, 292 
Wilson loop, 187

W-propagator 
massive, 274, 377 
407


Z⁰ boson, 277, 381, 385
decay width, 388-389, 397 
and number of light neutrinos, 397 